.SEC(Evaluation of LISP expressions,Evaluation)
.BEGIN "JRA" TURN ON "←","#";
.SS(Introduction)
%1
We shall now begin to look very closely at the intuitive
process which we have been using in the ⊗→evaluation↔← of LISP expressions.
Why? First, in an abstract sense,
that is the business of programming languages--evaluation of functions.
They aren't called procedure-oriented languages for nothing! Also,
function evaluation is the basic tool in the evaluation of LISP programs.
In fact a very general kind of evaluation process is going on in LISP,
that is, evaluation of recursive functions. Later we will
study the mechanisms which support recursive evaluation.
First, as we mentioned in regard to the polynomial arithmetic
examples, the intuitive mathematical operations say nothing about the process
of evaluation. %3(2*3) + (5*6)%* evaluates to %336%* regardless
of whether you evaluate %32*3%* first or %35*6%* first, or whether you do them in parallel.
Indeed, if we were forced to explicate mathematical
evaluation, we would probably have to say that it proceeds in parallel.
.P98:
Already on {yon(P17)} we have seen that order of
evaluation in conditional expressions can make a difference,
and certainly when we try to mechanize LISP
function evaluation on a sequential machine we %2will%* have to choose
%2some%* order of evaluation. We must also make some decision regarding
the order of evaluation of the arguments to a function call,
say %3f[x%41%*;x%42%*; ...x%4n%1]. We will assume that we will evaluate the arguments
from left to right. The evaluation scheme which we choose
is called %2⊗→inside-out↔←%* or %2⊗→call-by-value↔←%*. Alternative proposals exist
and have their merits; ⊗→call-by-name↔← (outside-in) evaluation is another
well-known scheme.
Here again, the style of evaluation can make a difference. Consider the example due
to J. Morris:
.P84:
%3f[x;y] <= [x = 0 → 0; T → f[x-1;f[y-2;x]]]%*. Evaluation of %3f[2;1]%*
will terminate if we always evaluate the leftmost-outermost occurrence of %3f%*.
Thus:
←%3f[2;1] = f[1;f[-1;2]] = f[0;f[f[-1;2]-2;1]] = 0.%*
However if we evaluate the rightmost-innermost occurrences first, the
computation will not terminate:
←%3f[2;1] = f[1;f[-1;2]] = f[1;f[-2;f[0;-1]]] = f[1;f[-2;0]] = ... .%*
Intuitively, call-by-value says:
evaluate the arguments to a function before you attempt to apply the arguments
to the function definition. Here are some examples:
Let %3f[x;y]%* be %3x%82%* + y%1 and consider %3f[3+4; 2*2]%*. Then
call-by-value says evaluate the arguments (%37%*#and#%34%*), associate those
with the formal parameters of %3f%* (i.e. %3x:7; y:4%*) and evaluate the body of
%3f%* (i.e.#%37%82%*#+#4#=#53%1); call-by-name says pass the unevaluated actual
parameters to the function, then perhaps evaluate. I.e. %3(3+4)%82%*#+#2*2%1
arises (which also evaluates to %353%*). We will say more about other styles of
evaluation later. However most of this section will be restricted to call-by-value.
Notice that the process of evaluation by value
is recursive. That is, it goes something like this:
If the expression to be evaluated is a constant then the value
(of the expression) is that constant. (the value of %33%* is %33%*).
If the expression is a variable then see what the current value associated
with that variable is. (Within the evaluation of say
%3f[3;4]%* where %3f[x;y]#<=#x%82%*#+#y %1the current value of the variable
%3x%* is %33%*.) The only other kind of expression that we can have is a function
name followed by arguments. (Say %3f[3;4]%*). In this case we evaluate the arguments,
that is, we %2recur%* and then apply the definition of the function to those
evaluated arguments. How do you apply the evaluated arguments to the function
definition? You associate or ⊗→bind↔← the formal parameters (or variables)
of the definition to the values of the actual parameters.
Then you evaluate the body of the function in this new environment. Why all the emphasis
on the mechanism of function evaluation? For one reason implementation
of programming languages requires careful consideration of this
problem. If you understand the formal model of LISP and the corresponding
model of implementation, you should have an exceptionally
clear picture of the problems of implementation of languages, and you
will understand many of the problems of semantics of languages.
.P80:
How do we transcribe the evaluation of arithmetic functions to LISP functions?
It's easy. First, the class of LISP constants is larger
(than the integers). What are the constants of LISP? They're just the Sexprs.
(The value of %3(A#.#B)%* is %3(A#.#B)%* --- just like the value of %33%* is %33%*
⊗↓We are ignoring the distinction between the %2numeral%* %33%* and the %2number%3 3.%1←).
The additional artifact of LISP which we have to talk about with respect to
evaluation is the conditional expression. But clearly its evaluation can also
be precisely specified. See {yon(P40)}.
In more specific detail here is some of the structure of the
LISP evaluation mechanism:
If the expression to be evaluated is a constant then the value
is that constant.
If the expression is a variable find its value in the current environment.
If the expression is a conditional then it is of the form
%3[p%41%* → e%41%*; p%42%* → e%42%*; ... p%4n%* → e%4n%1]. Evaluate it
using the semantics defined on {yon(P40)}.
If the expression is of the form: function name followed by
arguments then:
.BEGIN TABIT1(5);
\%21.%1 evaluate the arguments (from left to right)
\%22.%1 find the definition of the function
\%23.%1 associate the evaluated arguments with the formal parameters in the function definition
\ That is, modify the environment, associating the variables with their values.
\%24.%1 evaluate the body of the function.
.END
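For instance (a small sketch, using a hypothetical definition %3f[x;y]#<=#x#+#y%1), to evaluate %3f[2*3;#4]%* we would evaluate the arguments, getting %36%* and %34%*; find the definition of %3f%*; bind %3x%* to %36%* and %3y%* to %34%*; and finally evaluate the body %3x#+#y%* in the resulting environment, getting %310%*.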
So we %2can%* describe the outline of a mechanical procedure which
can be used in the evaluation of LISP functions. Recall our discussion
of the differentiation formulas in {yonss(P41)}. We described the mechanism of
differentiating as a recursive process. We then mechanized that scheme
as a LISP function, translating the polynomials to list notation,
since the domain of LISP functions is Sexprs. We are in a similar
situation here. It should be clear that we could write a LISP
function representing the evaluation process provided that we can
find a representation for LISP expressions as Sexpressions.
This mapping of LISP expressions to Sexprs is our first order of business. We
will accomplish this mapping by using an extension of the scheme introduced in
{yonss(P41)}.
This whole process of mapping LISP expressions onto sexprs, and writing a LISP
function to act as an evaluator may seem a bit incestuous or difficult to fathom.
First, it is no more obscure than the differentiation example.
It is just another instance of the diagram of {yon(P46)},
only now we are applying the process to LISP itself. The effect is to force
us to make precise exactly what is meant by LISP evaluation. This
precision will have many important ramifications.
Second, we've been doing evaluation of sexpr representations of
LISP expressions already.
The %2great mother of all functions%* is exactly the evaluation mechanism for
the LISP primitive functions and predicates, %3car, cdr, cons, atom%* and %3eq%*
when restricted to functional composition and constant arguments.
The %2great mother revisited%* is the extension to conditional expressions.
But let's stop for some examples of translating LISP functions into Sexpr notation.
.SS(Sexpr translation of LISP expressions)
.BEGIN NOFILL;TABS 25,50;TURN ON "\";
.GROUP
\%2LISP expression\translation as Sexpr
%1we will represent numbers just as numbers, e.g.:
\%32\2
%1we will translate identifiers to their upper-case counterpart. Thus:
%3\x\X
\y2\Y2
\car\CAR
.APART
.BEGIN FILL;
%1Now we've got a problem:
We wish to map arbitrary LISP expressions to Sexpressions. The LISP expression,
%3x%* translates to %3X%*; %3X%* is itself a LISP expression (a constant);
%2it%* must also have a translate.
We must be a little careful here.
When we write Son of Great Mother we will give it an sexpr representation
of a form to be evaluated. We might give it the representation of %3car[x]%*
in which case the value computed will depend on the current value bound to %3x%*.
We might also give the representation of %3car[X]%*; in this case we should
expect to be presented with an error message.
Or, for example some function %3foo%* we wish to write may return the sexpr
representation of a LISP form as its value.
Say %3foo[1]%* returns the representation of %3car[x]%* and %3foo[2]%* returns
the representation of %3car[X]%*. We must be able to distinguish between these
representations.
That is, given the representation,
there should be exactly one way of interpreting it as a LISP expression.
The mapping must be 1-1. So we must represent %3x%* and %3X%* as %2different%* sexprs.
The translation scheme we pick is:
for any sexpression, y, its translation is %3(⊗→%3QUOTE%*↔←%1 y%3)%1.
.END
.GROUP
.SELECT 1;
For example:
\%3X\(QUOTE X)
\(A . B)\(QUOTE (A . B))
\QUOTE\(QUOTE QUOTE)
.APART
.GROUP
%1
.BEGIN FILL;
We must also show how to map expressions of the form %3f[e%41%*#;#...#;e%4n%*]%1
onto sexprs.
We have already seen one satisfactory mapping for functions in prefix form
in {yonss(P41)}.
We will use that mapping (called Cambridge Polish) here. That is:
.END
\%3f[e%41%*;e%42%*; ...e%4n%*]\(F e%41%*' e%42%*' ...e%4n%1')
\\ where %3e%4i%1' is the translate of %3e%4i%3
%1Examples:%*
\car[x]\(CAR X)
\car[X]\(CAR (QUOTE X))
\cons[cdr[(A . B)];x]\(CONS (CDR (QUOTE (A . B))) X)
.APART
.GROUP
%1What's left? Conditional expressions. The mapping is self-explanatory.
\%3[p%41%* → e%41%*; ... p%4n%* → e%4n%*]\(⊗→%3COND%*↔← (p%41%*' e%41%*') ... (p%4n%*' e%4n%*'))
%1and an example:%3
\[atom[x] →1; q[y] → X]\(COND ((ATOM X) 1) ((Q Y) (QUOTE X)))
%1
.APART
.END
.P44:
Notice that %3(COND#(#...#))%* and %3(QUOTE#...#)%* %2look%* like translates of function
calls of the form %3cond[#...#] %* and %3quote[#...#]%*. This is %2not%* the case,
%3QUOTE%* and %3COND%* are not translates of LISP functions.
The constructs %3QUOTE%* and %3COND%* do not use call-by-value evaluation and
are handled in a special way as we shall soon see.
Since %3T%* and %3NIL%* are used so frequently, we chose their
translates to be %3T%* and %3NIL%* rather than %3(QUOTE#T)%* and %3(QUOTE#NIL)%*.
You should now go back and look at the %2tgm%*'s ({yonss(P39)}) again now
that you know
what they mean. You should note that with respect to atoms, the great
mothers are only defined for %3T%* and %3NIL%*. Other atoms (which are
the translates of variables) are not recognized and produce an error message.
The question of variables requires some careful study. Before we do that,
let's see what the great mothers actually accomplish.
First, tgm's explicitly tell you what the meaning of various subsets
of LISP is. That is, the tgms explicitly codify the semantics (or meaning)
of LISP. Next, if you are able to code tgm on a machine, then you can
input the Sexpr form of LISP expressions and they will be evaluated for you.
%2Great mother does it all.%*
The part of our informal argument which is clearly missing
in the tgms is the details of handling variables and the names of functions. That is,
how do we describe the binding of variables to values as a mechanical process
and, given the name of a defined function, how do we retrieve the actual definition?
These are actually the same problem: that of symbol table manipulations.
.SS(A first look at symbol tables,symbol tables,P92:)
%1
One of the distinguishing features of Computer Science is
its reliance on the ability to store and recover information.
Any formalism which purports to address itself to Computer Science
must take this into account. In particular we must be able to
handle the effect of assignment or binding, that is, the association
of a value with a name in an environment. A common device used to accomplish this
is a symbol table. In essence, a symbol table is simply a set of ordered pairs
of objects (a function or relation, if you wish). One of the
elements of each pair is a name; the other is considered to be
a value. We will come back to symbol tables in a while to study
the problems of efficient storage and retrieval of information.
It will suffice now simply to think of a symbol table as represented
in LISP by a list of dotted pairs --- name dotted with value---.
These simple symbol tables are also known as ⊗→association lists↔←
or %2⊗→a-lists↔←%*.
Since we are encoding the %2set%* of objects as a list, we are actually
encoding it as a %2sequence%*. It will be quite useful to exploit this
ordering imposed in the sequence.
.GROUP
Recall that we are representing variables as atoms; then if %3x, y, %*and %3z%*
had current values %32, 3, %*and %34%*, then a symbol table describing that fact could
be encoded as:
←%3((X . 2)(Y . 3)(Z . 4)) .%1
.APART
.GROUP
In the sequel, we will need to add elements to a table and, given a name, find
the corresponding value.
Here's a simple symbol-table-searching function, ⊗→%3assoc%*↔←:
.BEGIN TABS 10,25;TURN ON "\";NOFILL;
%3
\assoc[x;l] <= \[null[l] → NIL;
\\ eq[caar[l];x] → car[l];
\\ T → assoc[x;cdr[l]]]
%1e.g.%*
←assoc[Y;((X . 2)(Y . 3)(Z . 4))] = (Y . 3)
←assoc[U;((X . 2)(Y . 3)(Z . 4))] = NIL
.END
.APART
%1
%3assoc%* will examine the table from left to right, looking for the first
pair whose %3car%*-part matches the given atom. If such a pair is found,
then that pair is returned. If %2no%* such pair is found, the
result is %3NIL%*. Notice there may be more than one pair with the same
%3car%*-part; only the %2first%* is found.
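For example, if the same name appears more than once in the table:
←%3assoc[X;((X . 5)(Y . 3)(X . 2))] = (X . 5)%1
and the later pair, %3(X . 2)%*, is effectively hidden. This is exactly the behavior we will want when λ-binding rebinds a variable.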
.SS(Introduction to %3λ%*-expressions)
How about adding things to tables? We will see that %3cons%* plus
the recursion mechanism will automatically update the symbol table
for %2Great-mother%* when we call a function.
We will be adding entries to the symbol table when we call a function,
binding the formal parameters to the actual parameters. Intuitively,
when we call %3f[x;y] <= (x*y + y) %1in%3 f[2;3]%1, the dotted pairs
%3(X . 2) %*and %3 (Y . 3)%* should find their way into the table.
Besides the binding of formal parameters to actual values, there is
another binding process occurring in LISP.
Before we can call the function named %3f%*, its definition must
be in the symbol table; so there should be an entry (dotted pair)
with name-part %3F%* and a value-part representing %3(x*y + y)%*. An obvious
representation is perhaps,
←%3(PLUS (TIMES X Y) Y)%*
That's not quite good enough. If we simply associate the name %3F%*
with the list %3(PLUS#(TIMES#X#Y)#Y)%*, we will have lost some crucial information.
Namely, the relation between the variables %3x%* and %3y%* and the
body of the function will not be present. For example, %3f[y;x]#<=#(x*y#+y)%*
is %2not%*
the same function, but in our naive translation they have the same
representation. The solution is obvious: we must associate the list of
formal parameters with the body of the definition. Perhaps,
←%3((X Y)(PLUS (TIMES X Y) Y)).%*
This is the right approach and it can be made precise.
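For example, %3f[y;x]#<=#(x*y#+#y)%* would now be represented by
←%3((Y X)(PLUS (TIMES X Y) Y))%1
which is distinct from the representation of %3f[x;y]%*.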
There is a very elegant and beautiful theory behind this intuitive
discussion. We will sketch a few parts which will be helpful, but
first let's see where we are:
.SS(Review of evaluation)
We have seen that it is possible to mechanize at least one
scheme for evaluation of functions -- left-to-right and call-by-value.
We have seen that it is possible to translate LISP expressions
into Sexprs in such a way that we can write a LISP function
which will act as an evaluator for such translates. In the process
we have had to mechanize the intuitive devices we (as humans) might
use to recall the definition of functions and to recall the current
values of variables. It was soon clear that the same mechanism
could be used: that of symbol tables. To associate a variable with
a value was easy: to associate a function name with its definition
required some care. That is, part of the definition of a function
involves the proper association of formal parameters with the
body of the definition. (We actually need to be a bit more
precise than this in describing recursive functions, but this is
good enough for now.) The device we chose is called the %2lambda
notation%*.
.SS(%3λ%*-expressions,%3λ%*-expressions,P49:)
The λ-notation is a device invented by the logician, Alonzo Church, in
conjunction with his studies in logic and foundations of mathematics.
The λ-calculus is useful for a careful discussion of functions
and is therefore applicable in a purified discussion of procedures in programming
languages.
The notation was introduced into Computer Science by John McCarthy in
the description of LISP.
In the interest of precision we should note that there are actually several
important distinctions between Church's λ-calculus and the notation envisioned
by McCarthy. We will point out a few such discrepancies at the end of this
section.
.BEGIN TURN ON "←";
As we intimated previously, the λ-notation, as used in LISP, allows
us to attach names to functions in an unambiguous manner.
We have been very imprecise when writing %3f[x;y]#<=#(x*y#+#y)%* as
a definition of the function %3f%*.
First note that %3f[x;y]%* is %2not%* a function, %3f%* is.
To see what %3f[x;y]%* means consider the following example.
When we are asked to evaluate %3car[(A . B)]%* we say the value is %3A%*.
%3car[(A . B)]%* is an expression to be evaluated; it is a LISP ⊗→form↔←.
If %3car[(A . B)]%* %2is%* a form then so is %3car[x]%*;
only now the value of the
form depends on the current value assigned to the variable %3x%*.
So the %2⊗→function↔←%* is %3car%*; the %2form%* is %3car[x]%*.
The function is %3f%*; %3f[x;y]%* is a form, as is %3x*y#+#y%*.
Also our notation has really been specifying more than just the name.
The notation specifies the formal parameters (%3x%* and %3y%*) and
the order in which we are to associate actual parameters in a call
with the formals of the definition (%3x%* with the first, %3y%* with the second).
More subtly, the notation tells %2which%* variables in the function body
are bound variables. That is, which variables are to expect values when the
function is called. For example define %3g[x]#<=#(x*y#+#y)%*; then the
value of %3g[2]%* is well defined and is itself a function.
We wish to have a notation to assign all this other information with the
function body so that function definitions can be inserted into the symbol
table as "values". They will be parametric values, but they will be values.
The λ-notation performs this task by preceding the function body with
a list of the bound variables, called %2⊗→lambda variables↔←%*.
The list of variables is usually referred to as the %2⊗→lambda list↔←%*.
Then this resulting construct is preceded by "λ[" and followed by "]".
Thus using the above example, the function %3f%* is exactly the
same function as %3λ[[x;y]#x*y#+#y]%*.
We actually need a bit more than
λ-notation to specify recursive functions in a natural manner. See {yonss(P38)}.
The λ-notation introduces nothing new as far as our intuitive binding and
evaluation processes are concerned, it only makes these operations more
precise. So to evaluate a λ-expression, bind the evaluated actual parameters
to the λ-variables and evaluate the function body. Still, evaluation requires
the usual care. For example, is %3λ[[x]2]%* the constant function which always
gives value %32%*? No, it isn't; the evaluation of an expression (form) involving
this function requires the evaluation of the actual parameter associated with %3x%*.
That computation may not terminate.
Since we intend to include λ-expressions in our language we must include
a translation of them into sexpression form:
.BEGIN GROUP;TABIT2(25,50);
\%2LISP expression\translation as Sexpr%*
\%3λ[[x%41%*; ...; x%4n%*]%9x%*]\(LAMBDA (X%41%* ... X%4n%*) %1translate of %9x%3)
.END
Thus the character, λ, will be translated to %3LAMBDA%*. %3LAMBDA%* is therefore a
kind of prefix operator (like %3COND%*) which allows the evaluator to recognize
the occurrence of a function definition (just as %3COND%* is used by the
evaluator to recognize the occurrence of a (translate of a) conditional
expression).
Here are some examples of %3λ%*-expressions and their evaluation,
followed by some examples of the translation scheme:
.BEGIN NOFILL TABS 10;TURN ON "\";
\%2a. %3λ[[x;y] x%82%* +y][2;3] = 2%82%* + 3 = 7
\%2b. %3λ[[x;y] cons[car[x];y]][(A . B); C] = (A . C)
\%2c. %3λ[[y] cdr[y]][(A B)] = (B)
\%2d. %3λ[[x] car[x]][λ[[y] cdr[y]][(A B)]]
\ = λ[[x] car[x]][(B)]
\ = B
\%2e. %3λ[[x;y] x%82%* +y] ==> (LAMBDA(X Y)(PLUS (EXPT X 2) Y))
\%2f. %3λ[[x;y] cons[car[x];y]] ==> (LAMBDA(X Y)(CONS (CAR X) Y))
.END
******MORE*******
Our LISP syntax equations should now be augmented to include:
.BEGIN TABIT1(11);GROUP;
<function>\::= λ[<varlist><form>]
<varlist>\::= [<variable>; ... ; <variable>]
\::= [ ]
.END
One of the other more valuable applications of λ-notation occurs in that
nebulous area called efficiency. Efficiency is like structured programming --
difficult to define but recognizable when it occurs.
Consider the following sketch of a function definition:
←%3g <= λ[[x][p[lic[x]] → lic[x]; .... x ...]]%*.
Where %3lic%* may be a %2l%*ong %2i%*nvolved %2c%*alculation.
We certainly must compute %3lic[x]%* %2once%*. But as %3g%* is defined,
we would compute %3lic[x]%* %2again%* as e%41%* if p%41%* is true. Since our
current LISP subset has no side effects, this second calculation
is unnecessary. %3g%* is inefficient. Using λ-expressions, in a style
called %2⊗→internal lambdas↔←%* we can improve %3g%* as follows:
Replace the body of %3g%* with,
←%3λ[[y][p[y] → y; ... x ...]][lic[x]]%*. (*1*)
Call this new function %3g'%*:
.BEGIN TABIT2(10,18);GROUP;
\%3g' <= λ[[x]
\\λ[[y][p[y] → y; ... x ... ]][lic[x]]
\\ ] %1.
.END
Now when %3g'%* is called we evaluate the actual parameter, binding it
to %3x%*, as always; and evaluate the body, (*1*). Evaluation of (*1*)
involves calculation of %3lic[x]%* (%2once%*), binding the result to
%3y%*. We then evaluate the body of the conditional expression as before.
If p%41%* %2is%* true, then this definition of %3g'%* involves
one calculation of %3lic[x]%* and two table look-ups (for the value of %3y%*),
rather than the two calculations of %3lic[x]%* in %3g%*.
More conventional programming languages can obtain the same effect as
this use of internal lambdas by assignment of %3lic[x]%* to a
temporary variable. We will introduce assignment statements in LISP in
{yonss(P37)}.
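For a concrete instance (a sketch, taking %3reverse%* as the long involved calculation and %3null%* as the predicate), the definition
←%3g <= λ[[x][null[reverse[x]] → reverse[x]; T → car[reverse[x]]]]%1
computes %3reverse[x]%* twice, whereas
←%3g' <= λ[[x] λ[[y][null[y] → y; T → car[y]]][reverse[x]]]%1
computes it exactly once, binding the result to %3y%*.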
.END
We complete this section with a brief discussion of the pure λ-calculus
as compared to LISP's application. In the interest of readability, we will
write λ-calculus expressions in a Gothic bold-face type; e.g. %9λ%* and %dx%*
instead of LISP's %3λ%* and %3x%*.
The syntax of (well-formed) %9λ%*-expressions is simple:
.BEGIN TABIT1(10);GROUP;
<wfe> \::= %9λ%* <variable> %d.%* <wfe>
\::= <applic>
<applic>\::= <applic><atom>
\::= <atom>
<atom>\::= <variable>
\::= <constant>
\::= %d(%* <wfe> %d)%*
.END
The interpretation of a %9λ%*-expression is given in terms of simplification
or conversion rules. To present these rules we need a few definitions.
First a variable, %Dx%*, is a %2⊗→free variable↔←%* in a <wfe>, %dE%* if:
.BEGIN INDENT 0,10;GROUP;
%dE%* is the variable, %dx%*.
%dE%* is an application, %d(OP A)%*, and %dx%* is free in %dOP%* or in %dA%*.
%dE%* is a %9λ%*-expression, %9λ%dy.M%1, if %dx%* is free in %dM%* and %dx%* is
distinct from %dy%*.
.END
The second definition says a variable is a %2⊗→bound variable↔←%* in %dE%*
if it occurs in %dE%* and is not free in %dE%*.
Finally some notation: %dE[...]%* means that %d...%* is a "well-formed"
sub-expression in %dE%*.
There are three %9λ%*-conversion rules which are of interest here:
.BEGIN INDENT 0,10;
.GROUP;
%9α%*-conversion: %dE[%9λ%*x.M]%1 may be converted to %dE[%9λ%*y.M']%1 if %dy%* is not
free in %dM%* and %dM'%* results from %dM%* by replacing every free occurrence
of %dx%* by %dy%*. %9α%*-conversion allows alphabetic change of bound variables.
.APART;
.GROUP;
%9β%1-reduction: %dE[(%9λ%*x.M)N]%1 is %9β%*-reducible if no variable which occurs free in %dN%*
is bound in %dM%*. %dE[(%9λ%*x.M)N]%1 is reducible to %dE[M']%* where %dM'%* results
from the replacement of all free occurrences of %dx%* in %dM%* by %dN%*.
Compare call-by-name on {yon(P98)}.
.APART;
.GROUP
%9∂%1-reduction: We may desire a set of %9∂%*-rules which are associated with the
constants appearing in our %9λ%*-language. Each rule has the basic interpretation:
"%dE%* may always be replaced by %dE'%*." The constants and %9∂%*-reduction
rules are used to construct a %9λ%*-calculus for a specific domain of interest.
.APART
.END
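As an illustrative instance of these rules, %d%9λ%*x.x%1 may be %9α%*-converted to %d%9λ%*y.y%1, and %dE[(%9λ%*x.(f x))a]%1 is %9β%*-reducible, since no variable occurring free in %da%* is bound in %d(f x)%*; it reduces to %dE[(f a)]%1.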
Finally we will say that a %9λ%*-expression is in (%9β%*, %9∂%*)
%2normal form%* if
it contains no expression reducible by %9β%*, or %9∂%* rules.
%9α%*-variants are ignored.
This discussion of %9λ%*-calculi is not meant to be definitive; rather it should
convey the spirit of the subject. The discussion is complete enough, however,
to suggest some interesting problems for language design. First, it is apparent
that there may well be two or more sequences of reductions for a %9λ%*-expression;
is there any reason to believe that these reduction sequences will yield
the same normal forms?
Second, if we have two %9λ%*-expressions which reduce
to distinct normal forms, is there any reason to believe that they are
"inherently different" %9λ%*-expressions?
The first question is easier to answer. As we have seen in LISP, the order
in which calculations are performed can make a difference in the final
outcome. So too in %9λ%*-calculi. There, however, we can
show that if both reduction sequences terminate then they
result in the same normal form. This is usually called the Church-Rosser
theorem.
The second question obviously requires some explanation of "inherently different".
At one level we might say that by definition, expressions with different
normal forms are "inherently different". But what should we say if we could
give an interpretation to the %9λ%*-calculus such that the meaning of two specific
expressions is indistinguishable but the expressions have distinct normal forms?
Thus in a strong sense the expressions %2are%* equivalent but the reduction
rules are not sufficient to display this equivalence. This, indeed, is the
state of affairs. D. Scott %2has%* exhibited a model or interpretation of
the %9λ%*-calculus, and D. Park has shown the equivalence in this model
of two %9λ%*-expressions which do have distinct normal forms.
What does this discussion have to do with language design? Clearly the order of
evaluation or reduction is directly applicable. What about the second question?
This is related to the problem of characterization of a language. Do you say that
the language specification consists of a syntactic component and some
(hopefully precise) description of the evaluation of constructs in that language?
Or do you say that these two components, syntax and a machine, are merely
devices for describing or formalizing notions about some abstract domain of
discourse? The study of formal systems in mathematical logic offers a parallel.
There you are presented with a syntax and a system of axioms and rules of inference.
Most usually we are also offered a "model theory" which gives us models or
interpretations for the syntactic formal system; the model theory usually supplies
additional means of giving convincing arguments for the validity of statements
in the formal system. Indeed, the arguments made within the formal system
are couched in terms of "provability" whereas arguments of the model theory are
given in terms of "truth".
C. W. Morris named these three perspectives on language: syntax, pragmatics,
and semantics. I.e.
.BEGIN INDENT 0,10;GROUP;
%2Syntax%*: The synthesis and analysis of sentences in a language. This area is
well cultivated in programming language specification.
.APART
.GROUP
%2Pragmatics%*: The relation between the language and its user.
Evaluators (like %3tgmoaf, tgmoafr, ...%*) come under the heading of pragmatics.
Pragmatics are more commonly referred to as operational semantics in discussions
of programming languages.
.APART
.GROUP
%2Semantics%*: the relation between constructs of the language and the abstract
objects which they denote. This subdivision is commonly referred to as denotational
semantics.
.END
Put differently, syntax describes appearance, pragmatics describes value, and
semantics describes meaning.
Using this classification scheme, most languages are described using syntax, and
a half-hearted pragmatics; seldom does the question of denotation arise.
LISP was described by a syntax and a precise pragmatics; the %9λ%*-calculus
has a simple syntax and precise pragmatics. The work of D. Scott supplies
a semantics for the %9λ%*-calculus; the work of M. Gordon adapts Scott's work
to LISP.
Rest assured that we are not pursuing formalization simply as an
end in itself. Mathematical notation is no substitute for clear thought,
but we believe careful study of semantics will lead to additional
insights in language design ⊗↓R. D. Tennent has invoked this approach in
the design of %3QUEST%*.←.
Though the syntax and pragmatics of the %9λ%*-calculus are quite simple,
it is a most productive vehicle when used to discuss semantics.
As we said at the beginning of the section, this calculus was
intended to explicate the idea of "function" and is therefore a
rarefied discussion of "procedure".
A thorough discussion of the models of the %9λ%*-calculus is beyond the scope of this book,
but a few implications are worth mentioning.
If a %9λ%*-expression is to denote a function then the standard set-theoretic
notion of functions being "sets of ordered pairs" is not sufficient.
The conversion rules of the calculus allow a %9λ%*-expression to be
applied to any %9λ%*-expression including the expression %2itself%*.
There are well-known logical difficulties ⊗↓Russell's paradox.← if we allow
unrestricted self-application. LISP and other programming languages allow
procedures to pass procedures as arguments. See {yonss(P76)} for a discussion
of LISP's functional arguments. Even self-application is a viable tool in
a programming language ⊗↓See the problem on {yon(P99)} for an example.←.
The difficulty then, in describing a model or interpretation, is how to
delimit self-application so as to retain consistency. This is one of the
important and advanced aspects of D. Scott's work.
.CENT(Problems);
I What is the difference between %3λ[[ ] x*y + y]%* and %3x*y + y%*?
II Write a LISP predicate, %3free <= λ[[x;e]..]%*, which will give %3T%*
just in the case that %3x%* and %3e%* represent a variable and a %9λ%*-expression
respectively, and %3x%* is "free" in %3e%*.
****more probs****
.SS(Mechanization of evaluation,,P6:)
%1
As we have said, %3tgmoaf%* and %3tgmoafr%* ({yonss(P39)}) are evaluators
for subsets of LISP.
Armed with your symbol-table mechanism we could now extend the
%2great-mothers%* to handle variable look-ups.
Rather than do this we will display our burgeoning cleverness
and make a total revision of the structure of the
evaluators. In making the revision, the
following points should be remembered:
1. Expressions (to be evaluated) can contain variables, both simple
variables and variables naming λ-expressions. Thus evaluation must be done
with respect to an environment or symbol table. We wish to recognize other
(translates of) function names besides %3CAR, CDR, CONS, EQ%* and%3 ATOM%*
in our evaluator, but
explicitly adding new definitions to the evaluator in the style of the
recognizers for the five primitives is a bad idea. Our symbol table should
hold the function definitions and the evaluator should contain the general
schemes for finding the definitions, binding variables to values, and
evaluating the function body. By now you should be convinced that this
process is a reasonably well defined task and could be written in LISP.
2. All %2function%* calls are to be evaluated by-value. However there are
some %2⊗→special forms↔←%* we have seen which are not evaluated in the
normal manner. In particular, conditional expressions, quoted expressions,
and lambda expressions are handled differently.
Abstractly, we might write ⊗→%3sogm%*↔← (son of great mother) as:
.BEGIN SELECT 3;GROUP;TABIT1(7);
sogm <= λ[[exp;environ]
\[isvar[exp] → value[exp;environ];
\ isconst[exp] → denote[exp];
\ iscond[exp] → evcond[exp;environ];
\ isfunc+args[exp] → applyfn[func[exp];list_of_sogmed_args[arg[exp];environ];environ]]]
.END
.BEGIN SELECT 3;GROUP;TABIT1(15);
%1where:%*
applyfn <= λ[[fn;args;environ]
\[iscar[fn] → car[arg%41%*[args]];
\ iscons[fn] → cons[arg%41%*[args];arg%42%*[args]];
\ .... ....
\ islambda[fn] → sogm[body[fn];newenviron[vars[fn];args;environ]]
\ .... ..... ]]
.GROUP
%1and:%*
denote <= λ[[exp]
\[isnumber[exp] → exp; %1(see footnote {yon(P80)})%3
\ istruth[exp] → exp;
\ isfalse[exp] → exp;
\ isSexpr[exp] → rep[exp]]]
.END
%3list_of_sogmed_args%* is a function to perform left-to-right evaluation of the
argument list. %3newenviron%* is a function to manufacture a new environment
reflecting a rebinding of the λ-variables to the new values. %3func%* is a
selector to pick out the function from the representation of the form; %3arg%*
selects the argument list from the representation.
This abstract sketch is not meant to be complete, but rather only to
whet the intuition.
With this short introduction we will now write a more general evaluator
which will handle a larger subset of LISP than the %3tgm%*s.
The primary function is named ⊗→%3eval%*↔← (rather than Son of Great Mother).
It will take two arguments; the first
is a representation of an expression to be evaluated, and the second
is a representation of a symbol table. %3eval%* will recognize numbers, the
constants %3T%* and %3NIL%*, and if presented with a variable, will attempt to
find the value of the variable in the symbol table.
%3eval%* also handles the previously mentioned special forms. If %3eval%* sees
a ⊗→conditional expression↔← (represented by %3(COND ...)%* ) then the body
of the ⊗→%3COND%*↔← is passed to a subfunction named ⊗→%3evcond%*↔←.
%3evcond%* encodes the semantics as described on {yon(P40)}. The special
form, ⊗→%3QUOTE%*↔←, signifies the occurrence of a constant. Simply return
that constant.
As far as this %3eval%* is concerned, any
other expression is a function-call. In this case we %2apply%* the function
to the list of evaluated arguments. The function may either be a name or
it may be a λ-expression. %3apply%* handles both cases.
.BEGIN NOFILL;TURN ON "\";TABS 16,27;
.GROUP
Here's the new %3eval%*:
%3
⊗→eval↔← <= λ[[e;a]\[atom[e] →\[numberp[e] → e;
\\ eq[e;T] → T;
\\ eq[e;NIL] → NIL;
\\ T → cdr[assoc[e;a]]];
\ atom[car[e]] →\[eq[car[e];QUOTE] → cadr[e];
\\ eq[car[e];COND] → evcond[cdr[e];a];
\\ T → apply[car[e];evlis[cdr[e];a];a]];
\ T → apply[car[e];evlis[cdr[e];a];a]]]
.APART
.GROUP
%1where:%*
evcond <= λ[[e;a][eval[caar[e];a] → eval[cadar[e];a];
\ T → evcond[cdr[e];a]]]
.APART
.GROUP
%1and,%*
evlis <= λ[[e;a][null[e] → NIL;
\ T → cons[eval[car[e];a];evlis[cdr[e];a]]]]
.APART
%1
.END
The subfunctions, %3evcond%* and %3evlis%*, are simple. %3evcond %*you've seen
before in %3tgmoafr%*; %3evlis%* simply manufactures a new list consisting of the
results of evaluating (from left to right) the elements of %3e%*, using the
symbol table, %3a%*, where necessary.
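For example, using the %3eval%* just given and the table %3((X . 2))%1:
←%3evlis[((QUOTE A) X);((X . 2))] = (A 2)%1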
The interesting innovations appear in the function, %3apply%*.
⊗→%3apply%*↔← takes three arguments: a function, a list of evaluated arguments,
and a symbol table. %3apply%* explicitly recognizes the five primitive functions %3CAR,
CDR, CONS, EQ, %*and%3 ATOM%*. If the function name is a variable,
(represented as an atom) then the
definition is located in the symbol table by %3eval%* and applied to the arguments.
Otherwise the function must be a λ-expression. This is where things
get interesting. We know what we must do: evaluate the body of the
λ-expression after binding the formal parameters of the λ-expression
to the evaluated arguments. How do we ⊗→bind↔←? We add variable-value
pairs to the symbol table. We will define a subfunction, %3pairlis%*, to
perform the binding. Then all that is left to do is give the function
body and the new symbol table to %3eval%*. Here is %3apply%*:
.BEGIN NOFILL;TURN ON "\";TABS 20,28;
.GROUP
%3
.P69:
⊗→apply↔← <= λ[[fn;x;a][atom[fn] → [\eq[fn;CAR] → caar[x];
\\eq[fn;CDR] → cdar[x];
\\eq[fn;CONS] → cons[car[x];cadr[x]];
\\eq[fn;ATOM] → atom[car[x]];
\\eq[fn;EQ] → eq[car[x];cadr[x]];
\\T → apply[eval[fn;a];x;a]];
\eq[car[fn];LAMBDA] → eval[caddr[fn];pairlis[cadr[fn];x;a]] ]]
.APART
.GROUP
pairlis <= λ[[v;w;x][null[v] → x;
\T → cons[cons[car[v];car[w]];pairlis[cdr[v];cdr[w];x]]]]
.APART
%1
.END
So the only really new development is in the λ-binding process.
Notice that %3pairlis%* makes a new symbol table by consing new pairs onto
the front of the old symbol table; and recall that %3assoc%* finds the
%2first%* matching pair in the symbol table. %2This is important.%*
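For example, if %3a%* is the current table then
←%3pairlis[(X Y);(2 3);a] = ((X . 2)(Y . 3) . a)%1
so %3assoc%* will find the new bindings before any older ones.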
.P86:
To summarize then, the evaluation of an expression %3f[a%41%*; ...a%4n%*]%1
is the same as the result of applying %3eval%* to the sexpr translation
%3(F a%41%** ... a%4n%**) %1where %3a%4i%1* is the translate of %3a%4i%1.
This specification of the semantics of LISP using %3eval%* and %3apply%*
is one of the most interesting developments of Computer Science.
.BEGIN CENTERIT;GROUP;
←%2Problems
I %1Complete the abstract versions of %3sogm%* and Co.
.END
.SS(Examples of %3eval%*,examples of %3eval%*)
After this onslaught, some examples are in order.
Let's evaluate %3f[2;3]%* where %3f <= λ[[x;y] x%82%* + y].%1
After appropriate translation this is equivalent to evaluating:
←%3eval[(F 2 3);((F . (LAMBDA(X Y)(PLUS (EXPT X 2)Y))))]%*
Notes:
%21.%* %3((F . (LAMBDA (X Y) ... ))) = ((F LAMBDA (X Y) ... )) %*
%22.%* Since the symbol table, %3((F ...))%* occurs so frequently in the following
trace, we will abbreviate it as %3st%*. Note that we have no mechanism yet
for permanently increasing the repetoire of known functions. We must
resort to the subterfuge of initializing the symbol table to get the function
%3f%* defined.
%23.%* For this example we must assume that + and ↑ (exponentiation) are known functions. Thus
apply would have to contain recognizers for %3PLUS%* and %3EXPT%*:
.BEGIN NOFILL;
.GROUP
%3
...atom[fn] → [eq[fn;PLUS] → +[car[x];cadr[x]];
eq[fn;EXPT] → ↑[car[x];cadr[x]];
...
.APART
%1
.END
.BEGIN NOFILL;TABS 10,25;TURN ON "\";
.GROUP
So %3eval[(F 2 3);st]
\= apply[car[(F 2 3)]; evlis[cdr[(F 2 3)];st];st]
\= apply[F;evlis[(2 3);st];st]
\= apply[F;(2 3);st]
\= apply[eval[F;st];(2 3);st]
\= apply[(LAMBDA(X Y)(PLUS (EXPT X 2) Y)); (2 3);st]
\= eval[caddr[(LAMBDA(X Y)(PLUS (EXPT X 2) Y))];
\ pairlis[cadr[(LAMBDA(X Y)(PLUS (EXPT X 2) Y))];(2 3);st]]
\= eval[(PLUS (EXPT X 2) Y);pairlis[(X Y);(2 3);st]]
\= eval[(PLUS (EXPT X 2) Y);((X . 2)(Y . 3)(F LAMBDA(X Y) ...))]
\= apply [PLUS; evlis[((EXPT X 2) Y);((X . 2)(Y . 3)..)];((X . 2)...)]
\ %1Let's do a little of:%3 evlis[((EXPT X 2)Y);((X . 2)(Y . 3)...)]
\\= cons[eval[(EXPT X 2);((X . 2)(Y . 3) ...)];
\\ evlis[(Y);((X . 2) ...)]]
\\= cons[apply[EXPT;evlis[(X 2);((X . 2)...)];((X . 2) ...]
\\ evlis[(Y); ...]]
\\= cons[apply[EXPT;(2 2);((X . 2) ...];evlis[(Y); ...]]
\\= cons[↑[car[(2 2)];cadr[(2 2)]];evlis[(Y); ... ]]
\\= cons[↑[2;2];evlis[(Y); ... ]]
\\= cons[4;evlis[(Y);((X . 2)(Y . 3) ....)]]
\\= cons[4;cons[eval[Y;((X .2) ...)]; evlis[NIL;(( ...))]]]
\\= cons[4;cons[3;NIL]]
\\= (4 3) %1 which you knew anyway.
Now back to apply:
%3
\= apply[PLUS;(4 3);((X . 2)(Y . 3) ... )]
\= +[4;3]
\= 7 %1which you also knew.
.APART
.END
It should now be clear that %3eval%* and friends %2do%* begin to
perform as you would expect. It perhaps is not clear that a
simpler scheme might do as well. In particular, the complexity
of the symbol table mechanism which we claimed was so important
has not been exploited. The next example will indeed show that a
scheme like ours is necessary to keep track of variable bindings.
.GROUP
Let's sketch the evaluation of %3fact[3]%* where:
←%3fact <= λ[[x][x = 0 → 1;T → *[x;fact[x-1]]]].%*
.P42:
That is, %3eval[(FACT 3);st] %*where %3st%* names the initial symbol table,
←%3((FACT LAMBDA(X)(COND((ZEROP X)1)(T(TIMES X(FACT (SUB1 X))))))).%1
.APART
In this example we will assume that the binary function, *, the
unary predicate, %3zerop#<=#λ[[x]x#=#0]%* and unary function,
%3 sub1#<=#λ[[x]#x-1]%* are known and are
recognized in the evaluator by %3TIMES, ZEROP%* and %3SUB1%* respectively.
.BEGIN NOFILL;TURN ON "\";TABS 10,20,30;
.GROUP
%1Then%3 eval[(FACT 3);st]
\= apply[FACT;evlis[(3);st];st]
\= apply[(LAMBDA(X)(COND ...));(3);st]
\= eval[(COND((ZEROP X)1)(T( ...)));((X . 3) . st)]
\= evcond[(((ZEROP X)1)(T(TIMES X(FACT (SUB1 X)))));((X . 3) . st)]
%1(Now, let %3st1%* be%3 ((X . 3) . st) )
\= eval[(TIMES X(FACT (SUB1 X))); st1]
\= apply[TIMES;evlis[(X (FACT (SUB1 X))); st1];st1]
\= apply[TIMES;cons[3;evlis[((FACT (SUB1 X))); st1]];st1]
%1Now things get a little interesting inside %*evlis:
\evlis[((FACT (SUB1 X)));st1]
\\= cons[eval[(FACT (SUB1 X)); st1];NIL]
\\%1 and %* eval[(FACT (SUB1 X));st1]
\\\= apply[FACT;evlis[((SUB1 X));st1];st1]
\\\= apply[FACT; (2);st1]
\\\= apply[(LAMBDA (X)(COND ...));(2);st1]
\\\= eval[(COND((ZEROP X) 1) ...));((X . 2) . st1)]
\\\...
.APART
.FILL
%1
Notice that within this latest call on %3eval%* the symbol-table-searching function,
%3assoc%*, will find the pair %3(X . 2)%* when looking for the value of %3x%*. This is
as it should be. But notice also that the older binding, %3(X . 3)%*, is still
around in the symbol table %3st1%*, and will become accessible once we complete
this latest call on %3eval%*.
.GROUP
As the computation continues, the symbol table is proceeding initially from,
.NOFILL
%3
←st =((FACT LAMBDA(X)(COND ...))) %1to%3,
←((X . 3) . st) %1to%*,
←((X . 2)(X . 3) . st) %1to%*,
←((X . 1)(X . 2)(X . 3) . st) %1to finally%*,
←((X . 0)(X . 1)(X . 2)(X . 3) . st).
%1
.FILL
.APART
Since the search function, %3assoc%1, always proceeds from left to right through
the table and since the table entry function, %3pairlis%*, always %3cons%*es pairs
onto the front of the table before calling %3eval%*, we will get the correct
binding of the variables.
This symbol table mechanism is very important, so let's look at it
again in a slightly different manner.
.END
.<<the | | symbol table >>
.P43:
We will represent calls on %3eval%* informally, using LISP expressions
rather than Sexprs as input. We represent the symbol table between vertical
bars, "|", in such a way that if a table, t%41%*, is:
.BEGIN TABIT2(10,17);
\| b%4n%*\|
\| ...\| then %3cons%*ing a new element, b%4n+1%* onto t%41%* gives:
\| b%41%*\|
\| b%4n+1%*\|
\| b%4n%*\|
\| ...\|
\| b%41%*\|
.END
%3
.BEGIN NOFILL;TABS 10,46,58;TURN ON "\";
%1Then:%*
eval[`fact[3]'; |fact = λ[[x][x=0 → 1;T → *[x;fact[x-1]]]]|]
.GROUP
\= eval[`λ[[x][x=0 → 1; T → *[x;fact[x-1]]]];\|x = 3 | ]
\\|fact = λ[[ ... |
.APART
.GROUP
\= *[3;eval[`λ[[x] ....'; \| x = 2 |]
\\| x = 3 |
\\|fact = λ[[ ...] |
.APART
.GROUP
\= *[3; *[2;eval[`λ[[x] ... ';\| x = 1 |]
\\| x = 2 |
\\| x = 3 |
\\|fact = λ[[ ... |
.APART
.GROUP
\= *[3; *[2; *[1;eval[`λ[[x] ... ';\| x = 0 |]
\\| x = 1 |
\\| x = 2 |
\\| x = 3 |
\\|fact = λ[[ ... |
.APART
.GROUP
\= *[3; *[2; *[1;1]]] %1with:%*\| x = 1 |
\\| x = 2 |
\\| ... |
.APART
.GROUP
\= *[3; *[2;1]] %1with:%*\| x = 2 |
\\| ... |
.APART
.GROUP
\= *[3;2] %1with:%*\| x = 3 |
\\| ... |
\= 6. %1with:%*\|fact = λ[[ ... |.
= 6
.END
%1
Notice that after we went to all the trouble to save the old values
of %3x%* we never had to use them. However, in the general case of
recursive evaluation we must be able to save and restore the old values
of variables.
For example recall the definition of %3equal:
.BEGIN NOFILL TURN ON "\";TABS 17;
%3
equal <= λ[[x;y][atom[x] → [atom[y] → eq[x;y];T → NIL];
\ atom[y] → NIL;
\ equal[car[x];car[y]] → equal[cdr[x];cdr[y]];
\ T → NIL]]
%1If we were evaluating:%3
←equal[((A . B) . C);((A . B) . D)],
%1
then our symbol table structure would be changing as follows:
.END
%3
.BEGIN NOFILL;TABS 10,40;TURN ON "\";
.GROUP
\|equal = λ[[x;y] ...| ==>\|x = ((A . B) . C)| ==>
\\|y = ((A . B) . D)|
\\|equal = λ[[x;y]... |
\| x = (A . B) |\| x = A |
\| y = (A . B) |\| y = A |
\| x = ((A . B). C)| ==>\| x = (A . B) |
\| y = ((A . B). D)|\| y = (A . B) | ==>
\|equal = λ[[x;y ... |\| x = ((A . B). C) |
\\| y = ((A . B). D) |
\\|equal = λ[[x;y] ...|
.APART
\| x = B |\| x = C |
\| y = B |\| y = D |
\| x = (A . B) |\| x = ((A . B). C) | ==>
\| y = (A . B) | ==>\| y = ((A . B). D) |
\| x = ((A . B). C) |\|equal = λ[[x;y] ... |
\| y = ((A . B). D) |
\|equal = λ[[x;y] ... |
←|equal = λ[[x;y] ... |
.END
%1
This complexity is necessary, for while we are evaluating
%3equal[car[x];car[y]]%*, we rebind %3x%* and %3y%* but we must save the old
values of %3x%* and %3y%* for the possible evaluation of %3equal[cdr[x];cdr[y]].%*
But the claim is that using %3pairlis%* to cons the new
bindings onto the front of the symbol table as we call %3eval%*
does the right thing. That is, in %3apply:
←eq[car[fn];LAMBDA] → eval[caddr[fn];pairlis[cadr[fn];x;a]].
%1
The tricky part is that when we leave that particular call on
%3apply%*, the old table is automatically restored by the recursion
mechanism. That is, if %3st%* is the current symbol table then
%3cons%*-ing things onto the front of %3st%* doesn't change %3st%*, but if we
call %3eval%* or %3apply%* with a symbol table of say:
←%3cons[(X . 2);cons[(X . 3); st]] %*
then in %2that%* call on %3eval%* or %3apply%* we have access to %3x = 2%*, not
%3x = 3%*.
.P93:
Before moving on we should examine %3eval%* and %3apply%* to see how they
compare with our previous discussions of LISP evaluation.
The spirit of call-by-value and conditional expression evaluation is maintained.
λ-binding seems correct, though our current discussion is not complete.
At least one preconception is not maintained here. Recall the discussion on
{yon(P95)}. We wanted n-ary functions called with exactly n arguments. An
examination of the structure of %3eval%* and %3apply%* shows that
if a function expecting %2n%* arguments is presented with fewer, then
the result is undefined; but if it is given %2more%* than necessary
then the calculation is performed. For example:
.BEGIN TABIT1(33);GROUP;SELECT 3;
eval[(CONS(QUOTE A)(QUOTE B)(QUOTE C));NIL]
\= eval[(CONS(QUOTE A)(QUOTE B));NIL]
\= (A . B)
.END
.BEGIN TURN ON "#";
Let's look more closely at λ-binding. The scheme presented seems reasonable,
but as in the case above, there is more going on here than we are perhaps
aware of. In particular, we are thinking of the question of %2⊗→free variable↔←s%*.
Consider the function: %3λ[[x]x#+#y]%*. The variable %3x%* is called a
%2⊗→bound variable↔←%*; the variable %3y%* is a free variable. When we
apply this function, we will associate %3x%* with the value of the actual
parameter, but this λ-binding will not affect %3y%*. Clearly, %3y%*'s
value will determine the course of the computation. For example, if %3y%*
has value %31%*
then our function is %3add1%*; if %3y%*'s value is %3-1%*, then it is %3sub1%*.
How are we to give a value to %3y%*? We do so by λ-binding.
For example %3 λ[[y]#λ[[x]x#+#y][2]][1]%* binds %3y%* to %31%* and
then %3x%* to %32%*. Rest assured (though you should convince yourself) that
%3eval%* handles free or ⊗→global variable↔←s consistent with this interpretation.
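For example (assuming, as before, that %3apply%* recognizes %3PLUS%*), %3eval[(PLUS#X#Y);((X#.#2)(Y#.#1))]%1 yields %33%*, while the same form evaluated with %3(Y#.#-1)%* in the table yields %31%*: the free variable is simply looked up in whatever table is current when the form is evaluated.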
Handling of global variables varies from programming language to programming
language. The solution introduced by LISP is called ⊗→dynamic binding↔←.
This binding strategy advocates evaluation of free variables at the time
of activation of the function rather than at the time of definition
of the function ⊗↓The latter solution is used in ALGOL 60.←.
We first wish to develop a useful notation before delving further into
the intricacies of binding strategies.
.END
.CENT(Problems)
%2I%* Which parts of %3eval%* and Co. allow correct evaluation of functions
applied to too many arguments?
%2II%* On {yon(P93)} we noted that the evaluator performs "correctly"
when evaluating forms like %3cons[A;B;C]%*.
Can you find other anomalies in %3eval%* and friends?
.SS(Environments and bindings,,P77:)
.<< the Weizenbaum symbol table>>
This section will introduce one more notation
for describing symbol tables or environments. This notation, due to
J. Weizenbaum, only shows the abstract structure of the symbol table manipulations
during evaluation. Its simplicity will be of great benefit when we introduce
more complex binding schemes necessary for function-valued functions.
See {yonss(P76)}.
The later work of Bobrow and Wegbreit covers much of this topic but
in more detail than is needed for clear understanding.
An environment will be described as:
.BEGIN TABIT2(36,40);
\\Form
\\E%4i%*
\\|
\ E%4c%*\| E%4a%*
\_________
\var\| value
\\|
\ .....
\\|
.END
Form is the current form being evaluated.
E%4i%* is the name of the current environment or symbol table. The evaluation
of the current form takes place with respect to this environment. Variables
appearing in the form are evaluated in this environment. If the variable is
not found in E%4i%* then E%4a%* is then searched. If the variable is not found
in E%4a%* then the environment mentioned in the upper right-hand quadrant
of E%4a%* is searched. The search will terminate if the variable is found
or if the symbol "/" appears in the right-hand quadrant. In this second
case the variable is undefined. E%4a%* is the beginning of the
%2⊗→access environment↔←%* for global variables.
E%4c%* is called the %2⊗→control environment↔←%*. This environment
is the symbol table to be restored when the evaluation on the current
form has been completed. In all examples we have seen so far E%4a%* has
been the same as E%4c%*. Things will soon get more complex.
For now, the notation is used as follows: when we begin, an initial table E%40%* is
set up with "/" in its access and control fields. Execution of "<=" will
add a var-val entry to the table. When a λ-expression is entered, i.e.,
when we bind the evaluated arguments to the λ-variables, a new E%4i%* is set up
with access and control being the current environment.
Entries reflecting the binding of the λ-variables are made in E%4i%* and
evaluation of the λ-body is begun. When the evaluation is completed the old
control environment is restored as the current environment.
First, consider the evaluation of %3f[2;3]%* where %3f#<=#λ[[x;y]x%82%*#+#y]%1.
The flow of symbol tables is:
.BEGIN TURN ON "\";NOFILL;TABS 10,13,18,23,26,41,46,49,54,57,60,65,70,73;
.GROUP
\\%3f <= ...\\\f[2;3]\\\x%82%* + y%1
\\E%40%*\\\E%40%*\\\E%41%*\\\E%40%*
\ /\| /\\ /\| /\\E%40%*\| E%40%*\\ /\| /
\______\=>\______\=>\______\=>\______
\\|\ \ %3f\| λ[[x;y]x%82%* + y]%1\ \%3x\| 2\ \f\| λ[[x;y] ...
\\\\\\\ y\|3%1
.APART
.FILL;
The execution of %3fact[3]%* on {yon(P42)} results in a more interesting
example.
.BEGIN TURN ON "\";NOFILL;TABS 10,13,18,23,26,31,36,39,44,47,50,55,60,63;
.GROUP
\\%3fact[3]\\\fact[2]\\\fact[1]\\\fact[0]%1
\\E%40%*\\\E%41%*\\\E%42%*\\\E%43%*\\\E%44%* ...etc.
\ /\| /\\E%40%*\| E%40%*\\E%41%*\| E%41%*\\E%42%*\| E%42%*\\E%43%*\| E%43%* ...
\______\=>\______\=>\______\=>\______\=>\______
\%3fact\| λ[[x] ...\ x\| 3\\ x\| 2\\ x\| 1\\ x\| 0 ...%1
Notice in this example we will get the correct binding of %3x%*.
.APART
.END
.GROUP
As a final example showing access to ⊗→free variable↔← bindings consider:
%3f[3]%* where: %3 f <= λ[[x]g[2]]%* and %3g <= λ[[y] x+y]%*.
.NOFILL;
\%3f <= ...; g <= ...\f[3]\\\g[2]\\\x + y ....
\\E%40%*\\\E%40%*\\\E%41%*\\\E%42%* ....
\ /\| /\\ /\| /\\E%40%*\| E%40%*\\ E%41%*\| E%41%*
\______\=>\______\=>\______\=>\______\ ......
\\|\\ %3f\| λ[[x] g[2]]\\%3x\| 3\ \y\| 2 ...
\\\\%3g\| λ[[y] x+y]
.END
Notice that when we attempt to evaluate %3x + y%* we find %3y%* has a local
value, but we must look down the access chain to find a binding for %3x%*.
Obviously you should convince yourself that the access- and binding-scheme
used by LISP is faithfully described in these diagrams.
.REQUIRE "PROB8.PUB" SOURCE_FILE;
.SS(%3label%*,%3label%*,P38:)
One effect of placing "λ" and a list of λ-variables in front
of an expression is to bind those variables which appear both in
the λ-list and the expression. All other variables
appearing in the expression are %2⊗→free variables↔←%*.
For example, %3f%* is free in the following:
←%3f <= λ[[x][zerop[x] → 1; T → *[x;f[x-1]]] ].%1
Clearly our intention is that the %3f%* appearing to the right of "<="
is the same %3f%* appearing to the left of "<=". That is we want %3f%* to be
a bound variable and not free.
This has not been a problem for us. We have simply pre-loaded the
symbol table, binding %3f%* to its definition (or value). See {yon(P42)}. LISP
has been equipped with a slightly more elegant device for this binding.
It is called the %3label%* operator, and it is used thus:
←%3label[%1<identifier>;<function>]
and has exactly the effect of binding the <identifier> to the <function>.
To include %3label%* in the LISP syntax add:
.BEGIN TABIT1(11);CENTERIT;
<function>\::= %3label%*[<identifier>;<function>]
and the sexpr translation of the %3label%* construct should naturally be:
←%3(LABEL %1translate of <identifier> translate of <function> %3)%1.
Note that %3label%* should be a special form.
.END
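For example, the %3fact%* function used on {yon(P42)} would translate to
←%3(LABEL FACT (LAMBDA (X) (COND ((ZEROP X) 1) (T (TIMES X (FACT (SUB1 X)))))))%1
and no pre-loading of the symbol table would be required.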
.BEGIN TURN ON "#";
With the introduction of %3label%* we can talk more precisely about "<=".
When we write "what is the value of %3f[2;3]%* when %3f#<=#λ[[x;y]#...#]%*?"
we mean: evaluate the form %3[label[f;#λ[[x;y]#...]][2;3]]%*. If %3f%*
is not recursive, then the %3label%* application is unnecessary. The
anonymous function %3λ[[x;y]#...#]%* applied to %3[2;3]%* suffices.
What about statements like "evaluate %3g[A;B]%* where
%3g#<=#λ[[x;y]#...#f[u;v]#...]%* and %3f#<=#λ[[x;y]#...#]%1."?
%3label%* defines only one function; we may not say %3label[f,g;#...#]%*.
What we %2can%* do is embed the %3label%*-definition for %3f%* within
the %3label%*-definition for %3g%*⊗↓Indeed %2every%* occurrence of %3f%*
must be replaced by the %3label[f;...]%1 construct.←. Thus:
.BEGIN CENTER;SELECT 3;
label[g;#λ[[x;y]#...#label[f;#λ[[#...#]#...#]][u;v]#...]]%*
.END
The perspicuity of such constructions is minimal. Needless to say,
implementations of LISP offer better definitional facilities.
.END
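One such facility survives in Common Lisp as the %3labels%* construct, which plays
just the role of %3label%*: it binds a function name only within an accompanying body.
A sketch:
.BEGIN SELECT 3;NOFILL;GROUP;TURN OFF "←","#";
(labels ((fact (x) (if (zerop x) 1 (* x (fact (- x 1))))))
  (fact 5))                    ; => 120 ; fact is bound only inside the labels body
.END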
The apparent simplicity of the %3label%* operator is partly due to
misconception and partly due to the restrictions placed on the current
subset of LISP. %3label%* appears to be a
rather weak form of an assignment statement. When we extend LISP to include
assignments we can easily show that such interpretation of %3label%* is
insufficient;
when we talk about a mathematical interpretation of LISP
we will show that %3label%* really requires careful analysis.
.CENT(Problem);
I. Show one way to change %3eval%* to handle %3label%*.
II. Express the definition of %3reverse%* given on {yon(P97)} using
%3label%*.
.SS(functional arguments,functional argument,P76:)
.BEGIN TURN ON "#","←";
Recall our discussion of :
%3eval[(F#2#3);((F#.(LAMBDA(X#Y)(PLUS#(EXPT#X#2)Y))))].%*
We now know this is equivalent to:
%3eval[((LABEL#F#(LAMBDA(X#Y)(PLUS#(EXPT#X#2)#Y)))#2#3);NIL]%1.
In either case, the effect is to bind the name %3f%* to the λ-expression.
Binding is also going on when %3f%* is called: we bind %32%* to %3x%*, and %33%*
to %3y%*. Acknowledging Occam, why not attempt to combine the binding processes?
For example if %3 twice[x]%* is defined to be %3plus[x;x]%*, and %3foo[f;x]%*
were %3f[x]%*, then a truly expensive way of evaluating %3twice[2]%* would appear
to be %3foo[twice;2]%*.
.P81:
This will almost work; as usual %3foo%* would be considered a %2function%* by %3eval%*
and the arguments would be evaluated. We don't want the %2value%* of %3twice%*, we
want the %2name%*. So to stop evaluation, we might try %3QUOTE%*-ing %3twice%* when
we call %3foo%*; thus %3(FOO#(QUOTE#TWICE)#2)%*. %3QUOTE%*-ing is not quite
good enough, as we will see in a moment, but it would work here. %3f%* in the definition
of %3foo%* is called a %2functional argument%*; %3f%* is a λ-variable, but appears
in the body of the function where we expect a function. You, of course,
should realize by now that %2all%* function names appearing in our LISP expressions
are variables; hopefully they are bound to %2something%* (primitives or λ-expressions)
before we attempt to evaluate them.
Using Weizenbaum's environments we might get:
.BEGIN GROUP;TURN ON "\";NOFILL;TABS 10,13,23,28,31,46,51,54,59,62,65;
\%3foo[twice;2]\\f[x]\\ x + x%*
\\E%40%*\\\E%41%*\\\E%42%*
\ /\| /\\ E%40%*\| E%40%*\\ E%41%*\| E%41%*
\______\=>\______\=>\______
%3 twice\\| λ[[x] x + x]\\ x\| 2\\ x\| 2
%3 foo\\| λ[[f;x] f[x]]\\ f\| "twice"
%1So we %2will%* get the right bindings.
.END
If such baroque examples were the limit of functional arguments, their usefulness
would be questionable, however they are both a powerful programming tool and their
implementation sheds much light on programming language design (see {yonss(P3)}).
In particular, most implementations of LISP include a very useful class
of ⊗→mapping functions↔←.
.BEGIN INDENT 0,10;
⊗→%3maplist%*↔← is a function of two arguments, a list, %3l%*, and a functional
argument, %3fn%*. %3maplist%* applies the function %3fn%* (of one argument)
to the list %3l%* and its %3cdr%*s until %3l%* is reduced to %3NIL%*.
The value of %3maplist%* is the list of the values returned by %3fn%*.
Here's a definition of %3maplist%*:
.END
.GROUP;
←%3maplist <= λ[[fn;l][null[l] → NIL; T → cons[fn[l];maplist[fn;cdr[l]]]]]%*.
Thus:
%3←maplist[REVERSE;(A B C D)] = ((D C B A)(D C B)(D C)(D))%*.
.APART;
.BEGIN INDENT 0,10;
⊗→%3mapcar%*↔← is a similar function, applying %3fn%* not to %3l%* but the
%3car%* of %3l%*. Thus:
.END
.GROUP
.P34:
←%3mapcar <= λ[[fn;l][null[l] → NIL; T → cons[fn[car[l]];mapcar[fn;cdr[l]]]]].%*
and an example:
←%3mapcar[(LAMBDA(X)(CONS X NIL));(A B C D)] = ((A)(B)(C)(D)).%*
.APART
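Both mapping functions survive, essentially unchanged, in Common Lisp, except that
the functional argument is written first. A sketch:
.BEGIN SELECT 3;NOFILL;GROUP;TURN OFF "←","#";
(maplist 'reverse '(a b c d))                  ; => ((D C B A) (D C B) (D C) (D))
(mapcar (lambda (x) (cons x nil)) '(a b c d))  ; => ((A) (B) (C) (D))
.END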
To see why %3QUOTE%* is not sufficient, consider the following strange example:
.BEGIN TABIT2(10,29);GROUP;
%3\tester <= λ[[l;fn]\[null[l] → NIL;
\\ atom[l] → fn[ ];
\\ T → tester[car[l];(LAMBDA()(TESTER(CDR L) FN))]]]%*
.END
Notice %3fn%* is a functional argument and we are %3QUOTE%*-ing the actual
parameter bound to %3fn%*
in the recursive call on the function, %3tester%*.
Now if we call %3tester%* with: %3tester[(A#(B)#(C));(LAMBDA()(BAZ#(QUOTE#A)))]%*
the following occurs:
%21.%* %3l%* is bound to %3(A (B) (C))%*; %3fn%* is bound to
%3(LAMBDA()(BAZ#(QUOTE#A)))%*.
%22%*. In the conditional expression of %3tester%* p%41%* and p%42%* are false
and thus we evaluate the "otherwise" branch.
%23%*. We rebind %3l%* to %3A%*, rebind %3fn%* to %3(LAMBDA()(TESTER(CDR#L)#FN))%*
and lose big.
%24%*. %3null[l]%* is false, but %3atom[l]%* is true, so we evaluate %3fn%*.
When we attempt to evaluate %3cdr[l]%* we get the wrong binding of %3l%*.
What we find bound to %3l%* is %3A%*, rather than %3(A#(B)(C))%* which we
expected.
The environment at the point of call in e%43%* of the body of %3tester%* should
have been associated
with %3fn%* when we rebound %3fn%*. That is, we must associate the
symbol table which was current at the %2point of call%* with the functional argument.
We must now modify our use of the Weizenbaum environments to show this association.
When an assignment of the form %3f#<=#λ[[#...#]#...#]%* is performed we
will add the name %3f%* as usual but will now add %3λ[[#...#]#...#]#:%1E%4i%1
as value where E%4i%* is the environment at the point of assignment. When %3f%*
is %2called%* we set up a new environment as before, but the access environment
will now be set to E%4i%*.
This solution might seem a bit drastic, but it is easy to construct
counterexamples to less sophisticated solutions.
For example, attempting to substitute value for the free variables at the time
"<=" is done is not sufficient.
Consider the following example:
.BEGIN TURN ON "\";NOFILL;TABS 10,13,28,33,36,51,56,59,64,67,70;
\\%3g[3]\\\f[2]\\\x + y + z ....%1
\\E%40%*\\\E%41%*\\\E%42%* ...
\ /\| /\\E%40%*\| E%40%*\\E%41%*\| E%40%* ....
\______\=>\______\=>\______ ....
\%3f\| λ[[x]x+y+z]:%1E%40%3\\y\| 3\\x\| 2
\g\| λ[[y]f[2]]
\z\| 4
.END
Notice that we could substitute the value for %3z%* when %3f%* is defined
but we cannot do so for %3y%*. Even if there were a value for %3y%* at this time
it would be the wrong value.
What does this say about functions? We have already remarked that functions are
parametric values; to that we must add: functions are also tied to the environment
in which they were created; they cannot be evaluated %3in vacuo%*.
.P89:
What does this say about "<="? It %2still%* appears to act like an assignment statement.
It is taking on a more distinct character, since it must associate an environment
with the function body as well as perform the assignment.
The device LISP used to associate environments with functions is called
the ⊗→%3FUNARG%*↔← hack. (More is said about %3FUNARG%* in {yonss(P3)}.)
Instead of designating an instance of a functional argument by the %3QUOTE%*
operator, we introduce a new operator called ⊗→%3FUNCTION%*↔←. Thus:
←%3function[λ[[]tester[cdr[l];fn]]] %*or,
←%3(FUNCTION(LAMBDA()(TESTER (CDR L) FN))).%*
When %3eval%* sees the construction %3(FUNCTION %1fn%3)%* it returns as value
the list:
←%3(FUNARG###%*fn### current-symbol-table%3)%*.
When %3eval%*, or actually %3apply%* sees %3(FUNARG %*fn st%3)%*, that is,
when we are %2calling%3 fn%1, we use the symbol table %3st%*, rather than
the current symbol table for accessing global variables.
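The idea survives in Common Lisp, where %3function%* packages a function together with
the environment of its creation, though that environment is a lexical one rather than a
saved symbol table. A sketch, with names of our own choosing:
.BEGIN SELECT 3;NOFILL;GROUP;TURN OFF "←","#";
(defun make-adder (n)
  (function (lambda (x) (+ x n))))   ; capture the environment binding n
(funcall (make-adder 5) 2)           ; => 7
.END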
.<<pic of funarg execution>>
.P79:
Let's follow the behavior of an %3eval%* and %3apply%* family which
has been modified to handle functional arguments.
Let %3fn <= λ[[x%41%*; ... ;x%4n%*] ...]%1 be a function which will be used
as a functional argument in the context, say:
.BEGIN SELECT 3; CENTERIT;GROUP;
←fie <= λ[[ ...;foo ...] .... foo[ ...] ]
←fie[ ...; function[fn]; ...]
.END
in the sequel %3st%* and %3st%4save%1 will name symbol tables. "→" will mean
"points to" (i.e.,#%3cons%*). p%4i%*'s will be dotted pairs in a symbol table.
Then let's see what calling %3fie[#...;#function[fn];#...]%* will do.
.BEGIN TABIT2(10,60);GROUP
\%3(FIE ... (FUNCTION FN) ...) st:\NIL ⊗↓%3st%1 is not really empty.
Obviously it contains the function definitions.←
\ ....
\%1computation %3st%1 :\p%4n%* → p%4n-1%* → ... p%41%*
\%3(FUNCTION FN)
\%1 gives:
\%3(FUNARG FN st%4save%*)
\ ....
\%1computation %3st%1 : p%4m%* → p%4m-1%* ...→
\%3(FOO %1a%41%* ... a%4n%*)
\%1 gives:
\%3((FUNARG FN st%4save%*)%1a%41%* ... a%4n%3)
.BEGIN FILL;NARROW 0,40;
%1The a%4i%*'s are evaluated in the context %3st%* %2not%* %3st%4save%1 by
%3evlis[#...;#st]%* giving v%4i%1's.
We then apply %3fn%* to the v%4i%*'s in the context %3st%4save%1 %2not%*
in environment %3st%* by calling %3apply[#...#;#...#;st%4save%*].
%1This results in:
%3eval[ %1body of %3fn%*; %3((x%41%1 . v%41%3) → ... (x%4n%1 . v%4n%3)%1 → %3st%4save%*)]%1
.END
.END
After we finish this inner call on %3apply[#...#;#...#;st%4save%*]%1 the table %3st%*
is restored. Notice that our symbol tables are now really tree-like rather than
stack-like.
It should be apparent from %3eval%* that %3(QUOTE ...)%*
and %3(FUNCTION ...)%* will have the
same effect if %3fn%* has no global (or free) variables.
This seems like a lot of work to allow a moderately obscure construct to appear in a language.
However constructs like functional arguments appear in several programming languages
under different guises. Usually the syntax of the language is sufficiently
obfuscatory that the true behavior and implications of
devices like functional arguments are misunderstood. Faulty implementations
usually result. In LISP the problem %2and the solution%* appear
with exceptional clarity.
Functional arguments may be exploited to describe very general control structures.
We will say more about this application later.
Finally, here is a sketch of the abstract structure of the current %3eval%*.
.BEGIN SELECT 3;GROUP;TABIT1(12);
eval <= λ[[exp;environ]
\[isvar[exp] → value[exp;environ];
\ isconst[exp] → denote[exp];
\ iscond[exp] → evcond[exp;environ];
\ isfun[exp] → makeclosure[exp;environ];
\ isfunc+args[exp] → apply[func[exp];list_of_evaled_args[exp;environ];environ]]]
.APART
.GROUP
%1where:%*
apply <= λ[[fn;args;environ]
\[isfunname[fn] → ....;
\ islambda[fn] → eval[body[fn];newenviron[vars[fn];args;environ]];
\ isclosure[fn] → apply[func%41%*[fn];args;env[fn]];
\ .... ..... ]]
.APART
.END
←%2Problems%*
I. What changes should be made to the LISP syntax equations to
allow functional arguments.
II. Use %3foo%* on {yon(P81)} to define a function which computes factorial without
using %3label%* or explicit calls on the evaluator.
III. Extend %3eval%* and friends to functional arguments.
.GROUP;
.P99:
IV. An interesting use of functional arguments involves ⊗→self-applicative functions↔←.
An application of a function %3f%* in a context %3f[...,f,...]%* is an instance
of self application ⊗↓provided the designated argument position is a functional
argument←. Self-applicative functions can be used to define recursive
functions in such a way that the definition is not %2statically%* self-referential,
but is %2dynamically%* re-entrant.
For example here is our canonical example, written using self-applicative functions:
.APART
.BEGIN CENTER;SELECT 3;GROUP;
fact <= λ[[n]f[function[f]; n]]
f <= λ[[g;n][n=0 → 1; T → *[n; g[g; n-1]] ]]
.END
Use Weizenbaum's environments to show the execution of %3fact[2]%*.
.END
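For comparison, the self-applicative %3fact%* can be transcribed into Common Lisp as a
sketch; %3funcall%* makes the application of the functional argument explicit:
.BEGIN SELECT 3;NOFILL;GROUP;TURN OFF "←","#";
(defun f (g n)
  (if (= n 0) 1 (* n (funcall g g (- n 1)))))
(defun fact (n) (f (function f) n))
(fact 2)                            ; => 2
.END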
.SS(special forms,special form,P8:)
We have remarked that the evaluation scheme for LISP functions is
call-by-value and, for functions with multiple arguments, left-to-right
evaluation of arguments. We have also seen, in %3QUOTE%* and %3COND%*,
that not all forms to be evaluated in LISP fall within this category.
We have already noted on {yon(P44)} that %3QUOTE%* and %3COND%* are
%2not%* translates of functions. Clearly we don't evaluate the arguments to
%3(QUOTE ...)%*; indeed the purpose of %3QUOTE%* was to %2stop%* evaluation.
Also the `arguments' to %3COND%* are handled differently; their evaluation was
handled by %3evcond%*.
We therefore have called %3QUOTE%* and %3COND%* %2special forms%*.
We would like to discuss special forms as a generally useful technique.
Consider the predicates, ⊗→%3and%*↔← and ⊗→%3or%*↔←. We might wish to define
%3and%* to be a binary predicate such that %3and%* is true just in case
%2both%* arguments evaluate to %3T%*; and define %3or%* to be binary
and false just in case both arguments evaluate to %3NIL%*.
Notice two points. First, there is really no reason to restrict these predicates
to be %2binary%*. Replacing the words "binary" by "n-ary" and "both" by "all"
in the above description has the desired effect.
Second, if we evaluate the arguments to these predicates in some order,
say left-to-right, then we could immediately determine that %3and%*
is false as soon as we come across an argument which
evaluates to %3NIL%*; similarly a call on %3or%* for an arbitrary number
of arguments can be terminated as soon as we evaluate an argument giving
value %3T%*.
But if we insist that %3and%* and %3or%* are %2functions%* we can
take advantage of neither of these observations. Functions have a fixed
number of arguments, all of which are evaluated. The solution should be
clear: define %3and%* and %3or%* as special forms and handle
the evaluation ourselves. Presently, the
only way to handle special forms is to make explicit modifications
to %3eval%*. This is not terribly difficult (or desirable). When
we see a more realistic version of %3eval%* and Co. in {yonss(P30)}, we will have
a simple way to add such forms. See also {yon(P54)}.
Recognizers for the predicates must be added to %3eval%*:
.BEGIN CENTERIT;SELECT 3;
←eq[car[e];AND] → evand[cdr[e];a];
←eq[car[e];OR] → evor[cdr[e];a];
←.....%1 where:
.END
.BEGIN SELECT 3;TABIT1(16);
.P53:
evand <= λ[[l;a]\[null[l]→ T;
\ eval[car[l];a] → evand[cdr[l];a];
\ T → NIL] ]
evor <= λ[[l;a]\[null[l] → NIL;
\ eval[car[l];a] → T;
\ T → evor[cdr[l];a]] ]
.END
Notice the explicit calls on %3eval%*. This is expensive, but cannot be helped.
Later we will show a less costly way to handle those "non-functions" which
only have an indefinite number of arguments, all of which are
to be evaluated (see {yonss(P18)} on macros).
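For concreteness, here is a sketch of %3evand%* and %3evor%* transcribed into Common Lisp,
with the host %3eval%* standing in for the evaluator and its symbol table:
.BEGIN SELECT 3;NOFILL;GROUP;TURN OFF "←","#";
(defun evand (l)
  (cond ((null l) t)
        ((eval (car l)) (evand (cdr l)))
        (t nil)))

(defun evor (l)
  (cond ((null l) nil)
        ((eval (car l)) t)
        (t (evor (cdr l)))))

(evand '((atom '(a)) (error "never reached")))  ; => NIL, without evaluating the error form
.END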
←%2Problems%*
I What is the difference between a special form and call-by-name? Can call-by-name
be done in LISP (without redefining %3eval%*)?
.P71:
II %3select%* is a special form to be called as:
%3select[q;q%41%*;e%41%*;#...#;q%4n%*;e%4n%*;e]%1 and to be evaluated as follows:
%3q%* is evaluated; the %3q%4i%1's are evaluated from left to right until
one is found with the value of %3q%*. The value of %3select%* is the value
of the corresponding %3e%4i%1. If no such %3q%4i%1 is found the value of
%3select%* is the value of %3e%1. %3select%* is a precursor of the
%2⊗→case statement↔←%*. See {yon(P70)}.
Add a recognizer to %3eval%* to handle %3select%* and write a function to
perform the evaluation of %3select%*.
.SS(The %3prog%*-feature,,P37:)
.TURN ON "←";
%1
Recursion seems a bit supernatural, but it is legal, mechanizable
and rather cool.
There is another similar ⊗→bind↔←ing process occurring in LISP. It is
connected with an iterative style of LISP programming called the
⊗→%3prog%*↔←-feature.
First a general disclaimer: Some pure ("For my next trick I'll walk
on the water") LISP programmers feel that there is something slightly
obscene about writing iterative style programs. This isn't quite
true, but there are some good reasons for emphasizing recursion.
%21.%* Anyone can write an iterative scheme. Recursion is a powerful
tool and very possibly a new programming technique
for you. You should exercise your skill and resort to the %3prog%*
as a last resort.
%22.%* Two of the usual trappings of iteration are %2labels%* and %2go-to%*'s.
These are truly tools of the devil. In recent years several studies
by reasonable men have shown that algorithms which resort to labels
and gotos tend to be harder to read, harder to modify, sloppy in their
analysis of the process being coded, and generally ugly. The label
and goto hack invites patching over poor analysis instead of
reanalyzing the problem.
%23.%* With the control of the loop structure (either recursion or some
kind of controlled iteration, e.g., a FOR or WHILE statement) in the
hands of the language rather than in the hands of the programmer, the
static text and the dynamic flow of the execution have a parallel
relationship. This should have a beneficial effect for studies
investigating the provability of properties of programs.
Now that this has been said, here's our discussion of %3prog%*s.
Many algorithms present themselves more naturally as iterative
schemes. Recall the recursive computation of the length of a list given
on {yon(P19)}. %3length%* seems inordinately complex; our sympathies
lie more with %3length%41%1. Even this is not as straightforward as you would
expect for such a simple calculation. Rather, consider the following:
%21.%* Set a pointer to the given list. Set a counter to zero.
%22.%* If the list is empty, return as value of the computation, the
current value in the counter.
%23.%* Otherwise, increment the counter by one.
%24.%* Set the pointer to the cdr of the list.
%25.%* Go to line 2.
The new ingredients here are:
%21.%* An ⊗→assignment↔← statement: "set a ...".
%22.%* Access to some new cells: the pointer, the counter.
%23.%* Sequencing (albeit usually implied) between statements: line %21%*, then line %22%*...
%24.%* Gotos and labels.
%25.%* An explicit exit from the procedure: line %22%*.
Here is the LISP %3prog%* version of the length function:
.BEGIN TABIT2(15,17);SELECT 3;GROUP;TURN OFF "←";
.P45:
⊗→length↔← <= λ[[x]prog[[l;c]
\\l ← x;
\\c ← 0;
\a\[null[l] → return[c]];
\\l ← cdr[l];
\\c ← c+1;
\\go[a]] ]
.END
A few points should be made: The ⊗→%3prog%*-variables↔←, %3l%* and %3c%1, are ⊗→local variable↔←s.
That is, they only have meaning within this definition of %3length%*. If they
appeared in some program which called %3length%*, then the right thing
would happen; the old values would be saved (like λ-bindings) and then
restored after the call on %3length%* has been completed. Among other things,
this allows %3prog%*s to be used recursively.
Though assignment statements are normally executed for their effect on the
environment -- that is, for their side-effects -- assignments in LISP also have a value.
The value of an assignment is the value of the right-hand-side.
Conditional expressions have a slightly different effect inside %3prog%*s. If
none of the predicates in the conditional are true, then the statement following
the conditional is executed.
.P83:
Labels are clear; they are identifiers. Labels are local, that is, they must be found
within the body of the current %3prog%*. Labels may conflict with the
λ-variables or %3prog%*-variables; the evaluator for %3prog%*s can resolve the
conflicts by context.
⊗→%3go%*↔← is a little more complicated. %3go%* is a special form. If the argument
to %3go%* is %2atomic%* then it is interpreted as a (local) label; otherwise, the argument
is %2evaluated%* and the result of the evaluation is interpreted as a label.
This results in a useful programming trick. Let %3l%* be a list of dotted pairs,
each of the form, %3(%* object%4i%* . label%4i%3)%1. At each label%4i%* we
begin a piece of program to be executed when object%4i%* has been recognized.
Then the construct:
.P25:
←%3go[cdr[assoc[x;l]]]%* (*1*),
can be used to "dispatch" to the appropriate code when %3x%* is one of the
object%4i%*. This is an instance of %2⊗→table-driven↔←%* programming.
This programming trick shows "go-to" programming in its most pervasive form.
The blocks of code dispatched to can be distributed throughout the body of the
%3prog%*. Each block of code will usually be followed by a %3go%* back to
the code involving equation *1* (above). In fact the argument %3l%* in *1*
may be %2global%* to the %3prog%*-body.
The general effect is to make a %3prog%* which is very difficult to understand.
The LISP %3select%* ({yon(P71)}) will handle many of the possible applications
of this coding trick and result in a more readable program. The case-statement ({yon(P70)})
present in some other languages is also a better means of handling this problem.
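In Common Lisp, for example, such a dispatch is normally written with %3case%*; the
computed %3go%* itself does not carry over, since %3go%* there demands a literal label.
A sketch with illustrative names of our own:
.BEGIN SELECT 3;NOFILL;GROUP;TURN OFF "←","#";
(defun dispatch (x)
  (case x
    (plus  'handling-plus)
    (minus 'handling-minus)
    (t     'unrecognized)))
(dispatch 'minus)                   ; => HANDLING-MINUS
.END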
The ⊗→%3return%*↔← statement evaluates its argument, and returns that value as
the value of the %3prog%*.
When a %3prog%* is entered the %3prog%*- (local) variables are initialized to %3NIL%*.
The body of the %3prog%* is a sequence of statements and labels. Statements are
executed sequentially unless a %3go%* or %3return%* is evaluated.
If a %3prog%* runs out of statements then %3NIL%* is returned.
Returning from a %3prog%* unbinds the local variables.
Now to the problem of translating %3prog%*s into an s-expression representation.
We need be a little careful. First notice that %3prog%* is a %2⊗→special form↔←%*
(like %3COND%* and %3QUOTE%*). The construct:
←%3prog[[l;c]....]%* will be translated to:
←%3(PROG(L C) ...)%*.
But %3PROG%* is not the translate of a function
any more than %3(QUOTE ...)%* or %3(COND ...)%* is.
So the body of the %3prog%* must be handled specially by a
new piece of the evaluator.
.TURN OFF "←";
Similarly we must be careful about the interpretation of ←. First,
we will write %3x ← y%* in prefix form: %3←[x;y]%*. Now, using our usual LISP
technique we might map this to:
.BEGIN CENTER
%3(SET X Y).%* Is ← or ⊗→%3SET%*↔← a function in the usual LISP sense?
.END
Not likely; for if %3x%* and %3y%* have values %32%* and %33%*, for example, then the
call-by-value interpretation of %3←[x;y]%* would say %3←[2;3]%*. This was not our
intention, hopefully. What do we want? We want to evaluate the second argument
to ← while stopping the evaluation of the first argument. But LISP does
have a device for stopping evaluation: ⊗→%3QUOTE%*↔←. So we can define %3SET%* as a normal
LISP function, provided we
.BEGIN CENTER
translate %3x ← y%* as %3(SET (QUOTE X) Y)%*.
.END
.TURN ON "←";
For example, look at the evaluation of %3(SET (QUOTE X)(PLUS X 1)).%*
Assume the current value of %3X%* is %32.
SET%* is a function; evaluate the arguments from left to right. The
value of %3(QUOTE X)%* is %3X%*; the value of %3(PLUS X 1)%* is %33%*; assign %33%* to %3X%*.
Question: what does %3(SET Z (PLUS X 1))%* mean? Well if the current value of
variable %3z%* (represented by %3Z%*) is an identifier (non-numeric atom), then
%3(SET Z (PLUS X 1))%* makes sense. Assume the current value of %3Z%* is %3A%*, then
the effect of the %3SET%* statement is to assign the value %33%* to %3A%*.
Normally when you are making assignments, you want to assign to a %2name%* and
not a %2value%*; thus you will be using the form
←%3(SET (QUOTE ...) ...).%*
As a convenience an abbreviation, ⊗→%3SETQ%*↔←, has been made available:
←%3(SETQ ... ... )%* means %3 (SET (QUOTE ...) ...).%*
Again note that %3SETQ%* is not (the translate of) a function. It may be
defined as a special form or considered a notational convenience like %3list%*.
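Both %3SET%* and %3SETQ%* survive in Common Lisp with essentially these meanings, so the
discussion can be tried directly. A sketch, using variable names of our own:
.BEGIN SELECT 3;NOFILL;GROUP;TURN OFF "←","#";
(defvar x 2)
(defvar z 'a)
(set 'x (+ x 1))        ; assign x+1 to x : x is now 3
(setq x (+ x 1))        ; SETQ abbreviates the (SET (QUOTE X) ...) form : x is now 4
(set z 0)               ; z's value, the atom A, names the variable assigned
(symbol-value 'a)       ; => 0, while z itself is still A
.END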
Here is a translation of the %3prog%* version of %3length%*:
.GROUP
%3
.BEGIN TABIT3(10,17,20);
\(LAMBDA(X)
\\(PROG(L C)
\\\(SETQ L X)
\\\(SETQ C 0)
\\A\(COND ((NULL L)(RETURN C)))
\\\(SETQ L(CDR L))
\\\(SETQ C (ADD1 C))
\\\(GO A) ))
.END
.APART
%1
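This translation is nearly legal Common Lisp as it stands. Here is a sketch of a directly
runnable version, renamed to avoid the system's own %3length%* and with %3ADD1%* spelled
out as addition:
.BEGIN SELECT 3;NOFILL;GROUP;TURN OFF "←","#";
(defun my-length (x)
  (prog (l c)
     (setq l x)
     (setq c 0)
   a (cond ((null l) (return c)))
     (setq l (cdr l))
     (setq c (+ c 1))
     (go a)))
(my-length '(a b c))                ; => 3
.END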
Now that assignment statements are out in the open let's reexamine "<=".
We already know ({yon(P89)}) that "<=" does more than simply associate the
right hand side with a symbol table entry of the left hand side; it must also
associate an environment with the function body, and this environment is to be
used for accessing global variables. This operation of associating environments
is called forming the %2⊗→closure↔←%*. We thus might be tempted to say:
.BEGIN CENTER;SELECT 3;TURN OFF "←";
f <= λ[[ ... ] ...] %1is%* f ← closure[λ[[ ...] ...] ].
.END
where %3closure[g]%* is to give as value a representation of the function %3g%*
associated with the %2name%* of the current environment
⊗↓We may %2not%* associate the current %2value%* of the
environment (i.e. symbol table). Why?←.
Alas, this implementation is still not sufficient as we will see in {yonss(P90)}.
.REQUIRE "PROB7.PUB" SOURCE_FILE;
.SS(In retrospect,,P90:)
We will begin with a sketch of the LISP evaluator as it would now appear:
basic %3eval%* plus the additional artifacts for %3label, function%*, and %3prog%*.
This evaluation process is very important and, though it is perhaps new material,
should appear quite intuitive.
%3eval%* and friends are not to be memorized. If
you cannot write functions equivalent to %3eval%*, then you have not understood
LISP evaluation.
Finally, we will examine some of the weaknesses present in LISP.
There are alternatives to some of the LISP techniques and there are some things
which, in retrospect, LISP could have done better.
First, the basic evaluator.
There are two arguments to %3eval%*: a %2form%* ⊗↓throughout this section we
will say "form", "variable", "λ-expression", etc. rather than "an S-expression
representation of a" ... "form", "variable", "λ-expression", etc. No
confusion should result, but remember that we %2are%* speaking imprecisely.←, that
is an expression which can be evaluated; and second, an association list or
%2symbol table%*. If the form is a number, the atom %3T%* or %3NIL%*, return that
form. If the form is a variable, find the value of the variable in the current
table or environment. If the form is a (non numeric) S-expression constant, return
that constant. If the form is a conditional expression, then evaluate it
according to the semantics of conditional expressions. If the form is a
%3prog%*, evaluate the body of the %3prog%* after binding the local %3prog%* variables
to %3NIL%*⊗↓Perhaps you can now see why quoted expressions, conditional expressions,
and %3prog%*s are called %2special forms%*.←.
The form might also be a functional argument. In this case evaluation consists of
forming the ⊗→closure↔← of the function, i.e., associating the current environment
with the function. In LISP, this is done with the %3FUNARG%* device.
Any other form seen by %3eval%* is assumed to be a function followed by arguments.
The arguments are evaluated from left-to-right and the function is then applied
to these arguments.
The part of the evaluator which handles function application is called %3apply%*.
%3apply%* takes three arguments: a function, a list of evaluated arguments, and
the current symbol table. If the function is one of the five LISP primitives
then the appropriate action is carried out. If the function is a λ-expression
then bind the formal parameters (the λ-variables) to the evaluated arguments and
evaluate the body of the function. The function might also be the result of a
functional argument binding; in this case apply the function to the arguments
in the saved environment rather than in the current environment.
We might also be applying the label operator. All we need do here is apply the
function definition to the arguments in the current context augmented by the
binding of the function name to its definition.
We have saved what seems to be the simplest for last. That is the function
is named; what we do is look up the name of the function in the current environment.
To look something up means to find its value.
Currently we expect that value to be a λ-expression, which is then applied.
However, since function names are just variables, there is no reason that the
value of a function name could not be a simple value, say an atom, and
%2that%* value
applied to the arguments, etc. Indeed examination of %3apply%* on {yon(P69)}
will show that %3apply[X;#((A B))#;#((X#.#CAR)#...#)]%* %2will%* be handled correctly.
The obvious extension of this idea is to allow a %2⊗→computed function↔←%*. That
is, if the function passed to apply is not recognized as one of the preceding
cases, then evaluate the expression until it is recognized. Thus
we will allow such forms as:
.BEGIN CENTERIT;SELECT 3;
←((CAR#(QUOTE#(CAR#(A#.#B))))#(QUOTE#(A#.#B)))
.END
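In Common Lisp the same computed-function idea is expressed with an explicit %3funcall%*;
the function position is simply another form to be evaluated. A sketch:
.BEGIN SELECT 3;NOFILL;GROUP;TURN OFF "←","#";
(funcall (car '(car (a . b))) '(a . b))   ; => A : the function position evaluates to CAR
.END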
What conceptual difficulties are present in LISP evaluation?
One of the more important defects in LISP is its inadequate handling of function
valued variables, functional arguments being a particular case which LISP does
attend to correctly.
LISP was the first language to handle functional arguments, so it is not too surprising
that all is not quite right.
The %3FUNARG%* device is sufficient for simple uses of functional arguments and
closures. However, we should like to return functions and closures as %2values%*.
Returning ⊗→open function↔←s is easy; we can simply %3cons%* up λ-expressions.
Returning closures is more difficult. So perhaps we should finally lay "<=" to
rest. As you might have expected, difficulties occur when we examine recursive
definitions.
.P94:
Consider the canonical example:
.BEGIN CENTERIT;SELECT 3;
←fact <= λ[[x][x=0 → 1; T → *[x;fact[x-1]]]]
.END
First, we attempted to implement this as:
.BEGIN CENTER;SELECT 3;TURN OFF "←";
fact ← closure[λ[[x] ... fact[x-1] ...]]
.END
we must use the %2name%* of the environment current at closure-time rather
than the value, since at the time we form the closure the name %3fact%* is
a free variable. Any value associated with %3fact%* at this time would be the
wrong one ⊗↓%2One%1 value would suffice; what is it?←. For example:
.BEGIN TABIT3(30,34,38);GROUP;TURN OFF "←";
\\E%4i%*
\\|
\E%4c%*\|\E%4a%*
\___________
\%3fact%*\|%3foo%*
Then executing %3fact ← closure[ ... ]%* should give:
\\E%4i%*
\\|
\E%4c%*\|\E%4a%*
\___________
\%3fact%*\|%3λ[[x] ...fact[x-1]] : %1E%4i%*
rather than:
\\E%4i%*
\\|
\E%4c%*\|\E%4a%*
\___________
\%3fact%*\|%3λ[[x] ...fact[x-1]] : fact|foo%*.
.END
You should reflect on the current development before reading further.
There are problems however if we just associate the name of the environment
in the closure operation.
For example, consider the following sequence:
.BEGIN TABIT3(30,34,38);GROUP;
An initial environment:
\\E%4i%*
\\|
\E%4c%*\|\E%4a%*
\___________
\%3fact%*\|%3λ[[x] ...fact[x-1]] : %1E%4i%*
.apart;
.GROUP;
Now execute %3foo <= fact%*, giving:
\\E%4i%*
\\|
\E%4c%*\|\E%4a%*
\___________
\%3fact%*\|%3λ[[x] ...fact[x-1]] : %1E%4i%*
\%3foo%*\|%3λ[[x] ...fact[x-1]] : %1E%4i%*
.END
.BEGIN TURN ON "#";
This state should start worrying you; the "intent" of %3foo#<=#fact%*
was to make %3foo%* synonymous with %3fact%*. That clearly is not the case
though the right thing happens if we were now to evaluate an expression
involving %3foo%*. The problem is that it happens for the wrong reason: the occurrence
of %3fact%* in the body of %3foo%* will find the right definition of %3fact%*.
One more step will lead to disaster: %3fact#<=#loser%*.
.END
Now we have really lost. Though it is perfectly reasonable to redefine %3fact%*
--it is only a name-- our intent was clearly to keep %3foo%* as a realization
of the factorial function. This intent has not been maintained ⊗↓%1This example
was shown to the author by D. B. Anderson.←. So at least in the presence of
recursive definitions we really must be careful.
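The trap can be demonstrated directly in Common Lisp. In the following sketch the recursive
call is routed through the %2name%* %3fact%*, looked up at call time, just as the free
occurrence of %3fact%* in the closure would be:
.BEGIN SELECT 3;NOFILL;GROUP;TURN OFF "←","#";
(defun fact (x)
  (if (= x 0) 1 (* x (funcall 'fact (- x 1)))))
(setf (symbol-function 'foo) (symbol-function 'fact))  ; foo <= fact
(foo 4)                         ; => 24, only because fact still names factorial
(defun fact (x) (declare (ignore x)) 0)                ; fact <= loser
(foo 4)                         ; => 0 : foo no longer computes factorial
.END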
The %2real%* intent of the recursive definition of %3fact%* was to define a
function to effect the computation of factorial and then to %2name%* that
function %3fact%*.
Or, put another way, the effect of
.BEGIN CENTER;
%3fact%* <= %3λ[[x] ...fact[x-1]]%1 is quite different from:
%3foo%* <= %3λ[[x] ...fact[x-1]].
.END
Clearly we failed to handle this general problem, though
LISP does handle its %3label%* subset correctly.
Let's see what's really going on.
Perhaps an easier avenue lies in looking first at assignments to %2simple%*
variables rather than functional variables. It is clear how the environment
should change during the execution of a sequence say:
.BEGIN TABIT1(30);TURN OFF "←";SELECT 3;GROUP;
\x ← 3
\y ← x
\x ← 4
.END
Let's try something slightly more complicated:
.BEGIN TABIT1(30);TURN OFF "←";SELECT 3;GROUP;
\x ← 1
\x ← x%82%* - 6
.END
Here we simply assign %31%* to %3x%*; then while we are evaluating the right
hand side of the next statement, we look up the current value of %3x%*,
perform the appropriate computations, and finally assign the value %3-5%*
to %3x%*. Compare %2this%* to the %3fact%* definition; there we made a point
of not "evaluating" %3fact%* on the right hand side of "<=". What %2is%* happening?
.BEGIN TURN ON "#";TURN OFF "←";
Notice that %2after%* an assignment statement has been executed, then
equality holds between the left-hand and right-hand quantities ⊗↓Indeed, a
logic of programs, due to C.A.R. Hoare, exploits this fact in interpreting
assignment.←.
Let's look more closely at the relationship between "←" and "=".
Consider %3x#=#x%82%*#-#6%1.
Interpreting this expression as an equation ⊗↓%1Not as an expression
whose value is true or false depending on the current value of %3x%*← we find
two solutions: let %3x%* have value %3-2%* or %33%*.
Now if we examine our "definition" of %3fact%* in %2this%* light, interpreting
"<=" as being related to "=" rather than "←", then we are faced with an equation
to be solved. Only now the form of the solution is a %2function%* which satisfies
the equation. There are frequently many solutions; there may be no solutions.
In our case there is at least one solution: the usual factorial function.
So what we %2should%* write is something more like:
.BEGIN CENTER;SELECT 3;
fact ← %1a solution to the equation%*: f = λ[[x][x=0 → 1; T → *[x;f[x-1]]]];
.END
.END
Questions of when solutions exist, how many, and which are the "best" solutions are too
technical for this discussion, though we will say a bit more in {yonss(P85)}.
This discussion should make clear the necessity of distinguishing between
assignment and equality; fortunately, many programming languages use notations
which reinforce these differences.
.CENT(Problems);
I Write complete versions of %3eval%* and %3apply%*.
II The example of this section which points out the difficulties of closures
is not described in LISP. Can you write a pure LISP (i.e., without %3prog%*s)
program which will also exemplify this difficulty?
.SS(Towards a Mathematical Semantics,,P85:)
In {yonss(P15)} we informally introduced some ideas about proving properties
of programs. We would now like to expand on this idea of introducing
mathematical rigor into the discussion of programming languages. Though a firm
technical basis can be established for the following discussion, we
wish to proceed at an intuitive level and hopefully arouse curiosity
without damaging the underlying theory.
First, why worry about foundational and theoretical work at all? Programming,
it is said, is an art and art needs no formalizing. Indeed, but there is some
science in Computer Science and %2that%* part can be analyzed.
We have borrowed heavily (but informally) from mathematics and logic; we should
at least see how much of our discussion can stand more formal analysis. We
have used the language of functions and functional composition, but have
noted that not all LISP expressions are to be given a usual mathematical connotation.
In particular, conditional expressions are %2not%* to be interpreted as functional
application. In mathematics, a function is a mapping or set of ordered pairs;
composition simply denotes a %2new%* mapping or set of ordered pairs. Nothing
is said about how to calculate members of the set of pairs, or indeed, whether
such a set can be effectively (mechanically) given. Can we give an interpretation
to LISP expressions which involves nothing more than the mathematical
description of function but preserves our LISP-ish intent?
Most of the description of LISP which we have given so far is classified as ⊗→operational↔←.
That means our informal description of LISP and our later description of the LISP
interpreter are couched in terms of "how does it work (operate)". Indeed
the whole purpose of %3eval%* was to describe explicitly what is to happen
when a LISP expression is to be evaluated. We have seen ({yon(P84)}) that discussion
of evaluation schemes is non-trivial, and that the order of evaluation can affect the
outcome ({yon(P17)}).
.BEGIN TURN ON "#";
.P91:
However, many times the order of evaluation is immaterial
⊗↓One difficulty with the operational approach is that it (frequently)
specifies too much: C. Wadsworth.←.
We saw on {yon(P93)} that %3eval%* will perform without complaint when
given a form %3f[#...#]%* supplied with too many arguments.
How much of %3eval%* is "really" LISP and how much is "really" implementation?
On one hand we have said that the meaning of a LISP expression %2is%*
(by#definition) what %3eval%* will do to it. On the other hand
it is quite reasonable to claim %3eval%* is simply %2an implementation%*.
There are certainly other implementations; i.e., other functions %3eval%4i%1
which have the same input-output behavior as our %3eval%*.
The position here is not that %3eval%* is wrong in giving
values to things like %3cons[A;B;C]%*, but rather: just what is it that
%3eval%* implements?
This other way of looking at meaning in programming languages is exemplified by
⊗→denotational↔← or mathematical semantics.
.P96:
This perspective springs from a common mathematical, philosophical, or logical
device of distinguishing between a %2representation%* for an object, and the
object itself. The most usual guise is the numeral-number distinction.
Numerals are notations (syntax) for talking about %2numbers%* (semantics).
Thus the Arabic %2numerals%* %32, 02%*, the Roman numeral %dII%*,
and the Binary notation %310%*, all %2denote%* or represent
the %2number%* two. In LISP, %3(A#B), (A#.(B)), (A,B)%* and %3(A#.(B#.#NIL))%*
all are notations for the same symbolic expression, thought of as an abstract object.
We could say that a LISP form %2denotes%* an sexpression (or is undefined) just
as we say in mathematics that 2+2 denotes 4 or 1/0 is undefined.
Thus %3car[cdr[(A#B)]]%* denotes %3B%* ⊗↓Or more precisely, denotes a symbolic
expression which we represent by %3B%*.←.
The scheme which we use to evaluate the expression
is irrelevant; there is some object which our expression denotes and the process
which we use to discover that object is of no concern.
Similarly, we will say that
the denotational counterpart of a LISP function or %3prog%* is just a
(mathematical) function or mapping defined over our abstract domain.
Before proceeding, discretion dictates the introduction of some conventions
to distinguish notation from %2de%*notation.
.BEGIN GROUP;
We will continue to use italics:
.BEGIN CENTER;
(%3A, B, ..., x, ... car, ...(A . B) %*) as notation in LISP expressions.
.END
Gothic bold-face:
.BEGIN CENTER;
(%dA, B, ..., x, ...car, ...(A . B)%*) will represent denotations.
.END
.END
Thus %3A%* is notation for %dA%*;
%3car[cdr[(A#B)]]%* denotes %dB%*; the mapping, %dcar%* is the denotation
of the LISP function %3car%*.
.BEGIN TURN OFF "⊗";
The operation of composition of LISP functions denotes mathematical composition;
thus in LISP, %3car[cdr[(A#B)]]%* means apply the function %3cdr%* to the
argument %3(A#B)%* getting %3(B)%*; apply the function %3car%* to %3(B)%*
getting %3B%*. Mathematically speaking, we have a mapping, %dcar%2⊗%*cdr%1
which is a composition of the %dcar%* and %dcdr%* mappings; the ordered
pair %d<(A#B)#,#B>%* is in the graph of this composed mapping.
%dcar%2⊗%*cdr(#(A#B)#)%1 is %dB%*.
.END
In this setting, any LISP characterization of a function is equally good;
the "efficiency" question has been abstracted
away. But notice that many important properties of real programs %2can%* be
discussed in this rarefied mathematical context;
in particular, questions of equivalence
and correctness of programs are still viable, and indeed, probably
more approachable.
Lest your confusion give way to despair, we should point out that
denotational thinking %2has%* been introduced before.
When we said ({yon(P86)}) that:
.BEGIN CENTERIT;SELECT 3;
←f[a%41%*; ... a%4n%*] = eval[(F A%41%* ... A%4n%*) ...],
.END;
we are guilty of denotational thought. The left hand side of this equation
is denotational; the right hand side is operational.
This denotational-operational distinction is appropriate in a more general context.
When we are presented with someone's program and asked "what does it compute?"
we usually begin our investigation operationally, discovering "what does it %2do%*?".
Frequently by tracing its execution we can discover a concise (denotational) description
(E.g.,#"ah!#it#computes#the#square#root.").
.END
The author has already victimized you in this area of denotation and operation.
When %3great mother%* was presented it was given as an operational exercise,
with the primary intent to introduce the LISP evaluation process without
involving the stigma of complicated names. Forms involving %3great mother%* were
evaluated perhaps without understanding the denotation, but if asked "what does
%3great mother%* do?" you would be hard pressed to give a comprehensible
purely operational definition. However you might have discovered the true insidious
nature of %3great mother%* yourself; then it would be relatively easy to describe
its (her) behavior. Indeed, once the denotation of %3great mother%* has been discovered
questions like "What is the value of %3tgmoaf[(CAR (QUOTE (A . B)))]%*? " are usually
answered by using the denotation of %3tgmoaf%*: "what is the value of %3car[(A . B)]%*?"
⊗↓The question of how one gives a convincing argument that the successor of %3tgmoaf%*,
(i.e. %3eval%*), %2really does%1 faithfully represent LISP evaluation is the
subject of a recent dissertation by M.J.C. Gordon.←
Finally, for the more practically-minded: the care required in defining a
denotational model for LISP will pay dividends in motivating our
extension to LISP in {yonsec(P87)}.
Let us begin to relate these intuitions to our discussion of
LISP ⊗↓%1This section owes a great deal to M.J.C. Gordon's thesis.←.
We will parallel our development of interpreters for LISP since each subset,
%3tgmoaf, tgmoafr%*, and %3eval%*,
will also introduce new problems for our mathematical semantics.
Our first LISP subset considers functions, composition, and constants.
Constants will be elements of our domain of interpretation.
Clearly, our domain will include
the S-expressions since %2most%* LISP expressions %2denote%* Sexprs. However, we
wish to include more; many of our LISP functions are partial functions.
It is convenient to be able to talk about the undefined value; in other words we
wish to extend our partial functions on sexprs to be %2total%* functions on
(denotations of) sexprs or "undefined". Our domain
must then include a denotation for "undefined". We will use the symbol %aU%*.
We shall call this extended domain %aS%* (%3≡%*%d<sexpr>%*∪%aU%*)
⊗↓%d<sexpr>%*'s are simply to be the denotations for elements in <sexpr>.
I.e.,<sexpr>'s are specific (syntactic) representations; %d<sexpr>%*'s
are abstract objects. Compare the numeral-number discussion on {yon(P96)}.
We could simply muddle the distinction here, but it is worthwhile to make a clean
break.←.
Then, for example, the
mathematical counterpart to the LISP function %3car%* is the mapping %dcar%* from
%aS%* to %aS%* defined as follows:
.BEGIN TABIT2(10,20);GROUP
\%dcar: %aS%* → %aS%*
\\ %1| is %aU%* if %dt%* is atomic
\%dcar(t)\%1< is %dt%41%1 if %dt%* is %d(t%41%* . t%42%*)
\\ %1| is %aU%* if %dt%* is %aU%*.
.END
Similar strategy suffices to give denotations for the other primitive LISP functions
and predicates. For example:
.BEGIN TABIT2(10,20);GROUP
\%datom: %aS%* → %aS%*
\\ %1| is %dNIL%* if %dt%* is not atomic.
\%datom(t)\%1< is %dT%* if %dt%* is atomic.
\\ %1| is %aU%* if %dt%* is %aU%*.
.END
.BEGIN TURN OFF "{","}";
Notice that %datom%* maps onto a subset of %aS%* rather than a set like {true, false,%aU%*} of
truth values. This behavior is in the spirit of LISP. LISP has already decided to
map truth-values onto the sexpressions %3T%* and %3NIL%*.
.END
Now, corresponding to %3tgmoaf%*, we will have a function, %9D%4tg%1, mapping expressions
onto their denotations. Thus:
.BEGIN TABIT2(30,35);GROUP;TURN ON "#";
\%9D%4tg%1:\<form> => %aS%*.###⊗↓%1Note the difference between "=>" and "→".
The former denotes a map from LISP constructions into denotations;
the latter a map between denotations.←
\\ %3car%1 => %dcar%*
\\ etc. for %3cdr, cons, eq,%1 and%* atom%1.
.END
.BEGIN TURN OFF "{","}";
Before giving the details of %9D%4tg%1, we need to introduce some notation for
naming elements of the sets <sexpr> and <form>. Let %as%* range over <sexpr>
and %af%* range over <form>, then using "{" and "}" to enclose LISP constructs
we can write:
.BEGIN centerit;GROUP;
←%9D%4tg%1{%as%*} = %ds%*
←%9D%4tg%1{%3car[%af%*]%1} = %dcar%*(%9D%4tg%1{%af%*})
←with similar entries for %3cdr, cons, eq, %1and %*atom%*.
.END
.END
The similarity with %3tgmoaf%* should be apparent.
.BEGIN TURN ON "#";
The structure of our denotation function, %9D%1, will obviously become more complex
as we proceed. Notice that the denotations for the LISP functions are mappings
such that if any argument is undefined then the value is undefined. That is
%df(#...,%aU%*,#...)%1 is %aU%*. Such mappings are called %2⊗→strict↔←%*.
This is as it should be: forms %3f[e%41%*,#...,#e%4n%*]%1 are undefined if
any %3e%4i%1's are undefined. As we will see, there are natural mappings which are
%2not%* strict; i.e., they can give a value other than %aU%* even when an argument
is %aU%*.
Let's now continue with the next subset of LISP, adding conditional
expressions to our discussion. As we noted on {yon(P88)}, a degree of care need
be taken if we attempt to interpret conditional expressions in terms of mappings.
First we simplify the problem slightly; it is easy to show that a general
LISP conditional can be expressed in terms of the more simple expression,
[p%41%*#→#e%41%*;#%3T%*#→#e%42%*]%1.
Now, using our extended domain, %aS%*, it %2is%* reasonable to think
of conditional expressions as function applications in the mathematical
sense ⊗↓Obviously, the order of evaluation of arguments to the conditional
expression will have been "abstracted out". But recall the comment of Wadsworth
({yon(P91)}).←.
Consider [p%41%*#→#e%41%*;#%3T%*#→#e%42%*]%1 as denoting %dcond(p%41%*,e%41%*,e%42%*)%1
where:
.BEGIN TABIT2(10,22);GROUP
\%dcond: %aS%*x%aS%*x%aS%* → %aS%*
\\ %1| is%* y %1if%* x %1is%* %dT%*
\%dcond(x,y,z)\%1< is %dz%1, if %dx%1 is %dNIL%*.
\\ %1| is %aU%1, otherwise
.END
.END
This interpretation of conditional expressions is appropriate for LISP; other
interpretations of conditionals are possible. For example:
.BEGIN TABIT2(10,24);GROUP
\%dcond%41%*: %aS%*x%aS%*x%aS%* → %aS%*
\\ %1| is%* y %1if%* x %1is%* %dT%*
\%dcond%41%*(x,y,z)\%1< is %dz%1, if %dx%1 is %dNIL%*.
\\ %1| is %aU%1 if %dx%* is %aU%* and %dy ≠ z%*
\\ %1| is %dy%1 if %dx%* is %aU%* and %dy = z%*
.END
Notice that neither %dcond%* nor %dcond%41%1 is a strict mapping.
Now to add (simplified) conditionals to %9D%4tg%1, yielding %9D%4tgr%1:
.BEGIN TABIT1(12);TURN OFF "{","}";FILL;
\%9D%4tgr%1{%3[%af%41%3 → %af%42%3; T → %af%43%3]%1} =
%dcond(%9D%4tgr%1{%af%41%1}%d,%9D%4tgr%1{%af%42%1}%d,%9D%4tgr%1{%af%43%1}%d)%1
.END
As with the LISP evaluators, nothing particularly novel appears until we
begin consideration of variables. What is the denotation of a variable?
As we know, the value of a LISP <variable> ({yon(P66)}) depends on the
current environment or symbol table. Denotationally, things
really don't differ much from the discussion of %3eval%*. All we need
do ⊗↓This statement is a bit too optimistic!← is give a mathematical
representation of environments.
As a reasonable first attempt we can try characterizing environments as
functions from variables to sexpressions; i.e.:
.BEGIN CENTER
%denv %1is the set of functions: <%dvariable%*> → %aS%1.
.END
.BEGIN TURN ON "#";
Indeed, this is essentially the argument used in introducing %3assoc%* ({yonss(P92)}).
Note too, that %3assoc[x;l]#=#%dl(x)%1 is another instance of an
operational-denotational relationship. We will exploit this remark in a moment.
.END
.BEGIN TURN OFF "{","}";
So, given a LISP variable, %3x%*, and a member of %denv%*, say
the function %d{<x,2>,<y,4>}%*, then
our newest %9D%* should map %3x%* onto %d2%*. This is an intuitive way of saying
that %9D%* should map a member of <variable> onto a %2function%*. This function
should map a member of %denv%* onto an element of %aS%*. Now to make things more precise.
.END
.BEGIN TABIT2(30,35);TURN OFF "→";
\%9D%*:\<variable> => [%denv%* → %aS%*]
.END
.BEGIN TURN OFF "{","}";TURN ON "#";
Thus for example: %9D%*{%3x%*}(%d{<x,2>,<y,4>}%1)#=#%d2%*.
Introducing %av%* as a meta-variable to range over <variable>,
then for %dl#%9ε%d#env%1 we have:
.CENTER;
%9D%1{%av%*}%d(l)%*#=#%dl(v)%1
.END
We should then say that the %2meaning%* or denotation of a variable is a function;
whereas the %2value%* of a variable is an element of %aS%1.
Alas, our present description of %denv%* is inadequate. %denv%* is
to represent symbol tables; such tables must include function names
and their definitions. Function names, like <variable>s, are in the class
<identifier>; so in the interest of simplicity %denv%* should map from
%d<identifier>%* to ...; to what? What are the denotations of LISP
functions? Clearly they should be mathematical functions or mappings.
.BEGIN TURN ON "#";
So letting
%aFn%* be the set of functions:#%9S%4n=0%1[%aS%8n%1#→#%aS%1],
and %aId%1 be %d<identifier>%1∪%aU%1,
then a more realistic approximation to %denv%* is:
.BEGIN CENTER
%denv%1 is the set of functions: %aId%1 → %aS%1 ∪ %aFn%1.
.END
.END
That is, an element of %denv%* is a function which maps an identifier
either onto an s-expression or onto a function from s-expressions to s-expressions.
We shall soon see that even this is inadequate.
Meanwhile, let's continue expanding %9D%*.
We shall maintain the parallels between evaluation and denotation, by giving
%9D%4e%1 and %9D%4a%1 corresponding to %3eval%* and %3apply%*.
.BEGIN TABIT2(20,26);TURN OFF "→";GROUP;
Thus: \%9D%4e%1:\<form> => [%denv%* → %aS%*] and
\%9D%4a%1:\<function> => [%denv%* → %aFn%*].
.END
We must also make consistent modifications to the previous clauses of %9D%4tgr%1 to
account for environments. For example:
.BEGIN TURN OFF "{","}";TURN ON "#";
%9D%4e%1{%as%*}(%dl%*)#=#%ds%*. That is, the value of a constant is independent of the
specific environment in which it is evaluated. The simple modification for
conditional expressions is left to the reader.
.END
Some of the more obvious properties of %9D%4a%1 hold:
.BEGIN TURN OFF "{","}";
.BEGIN CENTER;
%9D%4a%1{%3car%1}%d(l)%1 = %dcar%*
.END
.BEGIN TURN ON "{","}";
Similar equations hold for the other LISP primitive functions and predicates.
To mimic look-up of a variable used as a function name we propose: ⊗↓This is
not a sufficient characterization.←
.END
.CENTER;
%9D%4a%1{%af%*}%d(l)%* = %dl(f)%*, where %af%* %9ε %1<function>.
.END
To describe the evaluation of a function-call in LISP we must add
an equation to %9D%4e%1:
.BEGIN TURN OFF "{","}";TABIT1(15);FILL;TURN ON "#";
\%9D%4e%1{%af%*[%as%41%1,#...,%as%4n%1]}%d(l)%1#=#
%9D%4a%1{%af%1}%d(l)(%9D%4e%1{%as%41%1}%d(l)%1,#...,%9D%4e%1{%as%4n%1}%d(l))%1
.END
Clearly, before we get very far in applying functions to values
we must give a mathematical characterization of function definitions.
We will do this in five (increasingly complex) steps. First
we handle λ-notation without free variables;
next, %3label%* definitions without free variables;
then λ-notation involving free variables;
explication of recursive definitions in denotational semantics comes next; and
finally we examine recursion in the presence of free variables.
First,
what should be the denotation of %3λ[[x%41%*, ..., x%4n%*] %9x%1]
⊗↓Assuming the only free variables in %9x%* are among the %3x%4i%*'s.← in an
environment? Intuitively, it should be a function, and recalling
({yon(P93)}) that %3eval%* expects n-ary
LISP functions to be supplied with at %2least%* n arguments, it should
be a function from %aS%9*%1 to %aS%* such that:
.BEGIN TURN OFF "{","}";TABIT1(15);FILL;TURN ON "#";
\%9D%4a%1{λ[[%av%41%1,#...#%av%4n%1]%as%1]}%d(l)%1#=#
%d%9λ%d(x%41%*,#...,#%dx%4n%*)%9D%4e%1{%as%1}%d(l#:#<x%41%*,%dv%41%d>,#...,#<x%4n%*,%dv%4n%d>)%1
.END
.BEGIN TURN ON "#";
where λ is the LISP λ-notation and %9λ%* is its mathematical counterpart.
%dv%4i%1 is the denotational counterpart of %av%4i%1.
In more detail:
%9λ%d(x%41%*, ... ,x%4n%*)e(x%41%*, ... ,x%4n%*) %1is a function %df%*: %aS%8n%1 → %aS%* such that:
.BEGIN TABIT1(15);GROUP;
\ | is %de(t%41%*, ... ,t%4n%*) %1if m%c≥%*n and %c∀%*i%dt%4i%1 %c≠%* %aU%1.
%df(t%41%*, ... t%4m%*) %1\<
\ | is %aU%* otherwise
.END
Also, %d(l#:#<x%41%*,%dv%41%d>,#...,#<x%4n%*,%dv%4n%d>)%1 is a modification
of %dl%* such that each %dv%4i%1 is bound to the corresponding %dx%4i%1.
.BEGIN TABIT1(20);GROUP;
That is:
%d(l#:#<x,v>)%1 is: %9λ%d(v%41%*)%2\if %dv = %aU%* %2then %aU%2
\else if %dv%41%* = %aU%2 then %aU%2
\else if %dv%41%* = v%2 then %dx%2
\else %dl(v%41%*)%1.
where: %2if%d p%* then %dq%* else %dr%1 is %dcond(p,q,r)%1.
.END
Now let's begin to clarify the mathematical content of the operator, "<=".
The device we use in LISP to name functions is %3label%*. Though {yonss(P90)} makes it
clear that a simple-minded approach is doomed, let's see how far it %2will%* go.
As before, we take our clues from the behavior of %3eval%* and %3apply%*.
In any environment %9D%4a%1 should map %3label[f;g]%* in such a way that the
denotation of %3f%* is synonymous with that of %3g%*.
That is, %df%* is a mapping satisfying the equation
%df(t%4i%*,#...,#t%4n%*)#=#%dg(t%4i%*,#...,#t%4n%*)%1.
So:
.END
.BEGIN TURN OFF "{","}";CENTERIT;FILL;TURN ON "#";
←%9D%4a%1{%3label[%af%1;%ag%*]}%d(l)%1#=#%9D%4a%1{%ag%1}%d(l)%1.
.END
This will suffice for our current λ-definitions; we need not modify %dl%*
since the name %3f%* will never be used within the evaluation of
an expression involving %3g%*.
.BEGIN TURN ON "#";
By now you should be getting suspicious that all is not quite right. We will
now justify your discomfort. First, our proposed description of the meaning
of λ-notation is a bit too cavalier. Our treatment of evaluation of free
variables in LISP (on {yon(P93)} and in {yonss(P77)}) shows that free
variables in a function are to be evaluated when the function is %2activated%*
and %2not%* when the function is defined. Thus a λ-definition generally
requires an environment in which to evaluate its free variables.
So its denotation
should be a mapping like: %denv%*#→#[%aS%9*%1#→#%aS%*] rather than
simply %aS%9*%1#→#%aS%1. Is this consistent with our understanding of the
meaning of λ-notation? Yes.
It is exactly what %2⊗→closure↔←%* was attempting to
describe. What we previously have called an ⊗→open function↔← is a creature of the
form:
%aS%9*%1#→#%aS%*.
.END
This enlightened view of λ-notation must change our conception of %denv%* as well.
Now, given a name for a function in an environment we can expect to receive
a mapping from %denv%* to an element of %aFn%*. I.e. for such names:
.BEGIN CENTERIT;
←%denv %1is the set of functions: %aId%1 → [%denv%1 → %aFn%1]
.END
We will let you ponder the significance of this statement for a few moments
while we continue the discussion of "<=".
A modification of our handling of %3label%* seems to describe the case
for recursion:
.BEGIN TURN OFF "{","}";CENTERIT;FILL;TURN ON "#";
←%9D%4a%1{%3label[%af%1;%ag%*]}%d(l)%1#=#%9D%4a%1{%ag%1}%d(l#:#<f,%9D%4a%1{%ag%1}%d>)%1.
.END
Interpreting this equation operationally, it says: when we apply a %3label%*
expression in an environment it is the same as applying the body of the definition
in an environment modified to associate the name with its definition.
This is analogous to what the LISP %3apply%* function will do.
Recalling the inadequacy of this interpretation of %3label%* in more
general contexts ({yon(P94)}),
we should perhaps look further for a general reading of %3label%*. Our hint
comes from our interpretation of "<=" as an equality. I.e., recall:
.BEGIN CENTER;SELECT 3;TURN OFF "←";
fact ← %1a solution to the equation:%* f = λ[[x][x=0 → 1; T → *[x;f[x-1]]]].
.END
What we need is a representation of an "equation-solver" which will
take such an equation as input and will return as value a function
which satisfies that equation. In particular we would like the %2best%*
solution in the sense that it requires the minimal structure on the function.
⊗↓Like a free group satisfies the group axioms, but imposes no other
requirements.←
This request for minimality translates to finding the function which
satisfies the equation, but is least-defined on other elements of the
domain. Though a full discussion of "least" brings in the recent work of D. Scott
and is a bit too advanced for our purposes, the intuition behind
this study is quite accessible and again illuminates the distinction
between mathematical meaning (denotational) and manipulative meaning (operational).
Consider the following LISP definition:
.BEGIN CENTER;SELECT 3;
f <= λ[[n][n=0 → 0; T → f[n-1] + 2*n -1]]
.END
.BEGIN TURN ON "#";
When we are asked what this function is doing, most of us would proceed
operationally; that is, we would begin computing %3f[n]%* for different
values of %3n%* --what#is#%3f[0]?%*,
what is %3f[1]%1,#...#. It is profitable to look at this process differently:
what we are doing is looking at a %2sequence%* of functions,
call them %df%4i%1's .
.END
.BEGIN TABIT3(10,16,44);SELECT d;GROUP;TURN OFF "{","}";
\%df%40%1 =\%d{<0,%aU%*>,<1,%aU%*>, ... }\%1when we begin, we know nothing.
\%df%41%1 =\%d{<0,0>,<1,%aU%*>, ... }\%1now we know one pair.
\%df%42%1 =\%d{<0,0>,<1,1>, ... }\%1now two
\%df%43%1 =\%d{<0,0>,<1,1>,<2,4>, ... }\%1now three
\ ...\ ... ...\%dEureka!!
.END
When (if) the sudden realization strikes us that the LISP function is
"really" (denotationally) %dn%82%1 on the non-negative integers,
then we may either take
that insight on faith or subject it to proof. The process of discovering
the denotation can be construed as a limiting process which creates
the least-defined function satisfying the LISP definition. That is:
.BEGIN CENTER;SELECT d;
%9λ%d((n)n%82%d)%1 = %1least upper bound of%d f%4i%1's;
.END
where %df%4i%1 may also be characterised as:
.BEGIN TABIT1(12);SELECT D;group;
\ %1|%d n%82%1 if %d0%c≤%dn%c≤%di
f%4i(n)%1 =\<
\ | %aU%1 if %di%c<%dn
.END
We may think of our "equation-solver" as proceeding in such a manner.
Though it is not at all obvious, such an equation solver
%2does%* exist; it is called the %2⊗→fixed-point operator↔←%* and is
designated here by %dY%*. We will now give an intuitive derivation of %dY%*.
In terms of our example we want a solution to %df = %9t%d(f)%1, where:
.BEGIN CENTER;
%9t%d(g) = %9λ%d((n) cond(n=0, 0, g(n-1)+2*n-1))%1,
.END
Our previous discussion leads us to believe that
%9λ%d((n)n%82%d) %1for %dn %c≥%d0%1 is the desired solution.
.BEGIN TURN ON "#";
How does this discussion relate to the sequence of functions %df%4i%1?
First, it's important to keep the domains of our various mappings in mind:
%df%*#%9ε%*#%aFn%1 and %9t%* is a functional in %aFn%1#→#%aFn%1.
Let's look at the behavior of %9t%* for various
arguments. The simplest function is the totally undefined function, %aU%*#⊗↓We
perhaps should subscript %aU%* to distinguish it from previous %aU%*'s.←.
.BEGIN CENTER;
%9t%d(%aU%*) = %9λ%d((n) cond(n=0, 0, %aU%*(n-1)+2*n-1))%1,
.END
but this says %9t%d(%aU%*)%1 is just %df%41%1!
Similarly,
.BEGIN CENTER;
%9t%d(%9t%d(%aU%*))#=#%9λ%d((n) cond(n=0, 0, f%41%*(n-1)+2*n-1))%1,
.END
is just %df%42%1.
Thus, writing %9t%8n%1 for the composition of %9t%1...%9t%1, n times, ⊗↓Define
%9t%80%1 to be the totally undefined function, %aU%1.← we can say
.BEGIN CENTER;GROUP;
%df%4n%* = %9t%8n%d(%aU%*)%1 or,
%9λ%d((n)n%82%*)%1 = lim%4n→∞%9t%8n%d(%aU%d)%1.
.END
Or in general the fixed-point of an equation %df = %9t%*(f)%1 satisfies the
relation:
.BEGIN CENTER;
%dY(%9t%*) = lim%4n→∞%9t%8n%d(%aU%d).%1
.END
Thus in summary, %dY%* is a mapping:[%aFn%* → %aFn%*] → %aFn%*
such that if %9t%*#%9ε%*#[%aFn%*#→#%aFn%*] and %df%*#=#%9t%d(f)%1, then
%9t%d(Y(%9t%*))%1#=#%dY(%9t%*)%1 and %dY(%9t%*)%1 is the least defined of the solutions
to %df%*#=#%9t%d(f)%1.
.END
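.BEGIN TURN ON "#";
The successive approximations %df%4i%1 can actually be computed. Here is a minimal
sketch in a modern Lisp dialect; %3tau%* is the functional of the example, %3bottom%*
plays the role of the totally undefined function, and %3iterate%* composes %3tau%* with
itself %3i%* times. All of these names are ours, introduced only for illustration.
.END
.BEGIN NOFILL;SELECT 3;GROUP;
;; bottom models the totally undefined function: any call is an error
(defun bottom (n) (declare (ignore n)) (error "undefined"))

;; tau maps a partial function g to a better approximation
(defun tau (g)
  (lambda (n) (if (zerop n) 0 (+ (funcall g (- n 1)) (* 2 n) -1))))

;; iterate applies tau i times to bottom, giving the i-th approximation
(defun iterate (i)
  (if (zerop i) (function bottom) (tau (iterate (- i 1)))))

;; (funcall (iterate 4) 3) is 9; (funcall (iterate 4) 4) is still an error:
;; each approximation knows only finitely many pairs
.END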
So the search for denotations for %3label%* might be better served by:
.BEGIN TURN OFF "{","}";CENTERIT;FILL;TURN ON "#";
←%9D%4a%1{%3label[%af%1;%ag%*]}%d(l)%1#=
#%dY(%9λ%d(h)%9D%4a%1{%ag%*}%d(l%1#:#%d<f,h>))%1.
.END
Which characterization of %3label[f;g]%* is "better"?
The first denotation parallels the behavior of %3eval%* and %3apply%*, applying
%3g%* in a %denv%* modified by the pair %d<f,g>%*.
The latter fix-point characterization says %3label[f;g]%* is a particular
solution to the equation %3f=g%*. As we have seen, the "meaning" of %3label%*
is better served by this fix-point interpretation. The crucial questions, however,
are: first, are these two denotations different?
And which, if either, interpretation is %2correct%*?
That is, which
characterization has the same %2effect%* as %3eval%* and %3apply%*?
.BEGIN TURN OFF "{","}";TURN ON "#";
First, the characterizations are %2not%* the same. Examination of the
behavior of %9D%4e%1{%3label[car;car][(A#B)]%1} will exhibit a discrepancy.
.END
The general question of the correctness of the denotational semantics which we
are developing is the subject of much of M. Gordon's thesis.
***add lisp induction***
The intuitions presented in this section can be made very precise.
The natural ordering
of "less defined" exemplified by the sequence of %df%4i%1's: %df%4i%1 is less
defined (or approximates) %df%4j%1, j%c≥%1i, imposes a structure on our domain of functions.
Indeed, a structure can be naturally superimposed on all of our domains.
If we require that some additional simple properties hold on our domains, then
the intuitions of this section can be justified mathematically.
*** ADD MONOTONICITY, CONTINUITY
The papers of Scott, Gordon, and Wadsworth supply the missing precision.
In case you think least fixed points are too removed from the realities of
programming language design, please refer to {yon(P94)} again.
Though intuition finally uncovered a counterexample to the general applicability
of the specific implementation of LISP's %3label%*, intuition has been guilty
of superficiality here. The nature of recursion is a difficult question and
this fixpoint approach should make us aware of the pitfalls.
Actually we glossed over a different kind of fixed point a few moments
ago:
.BEGIN CENTERIT;
←%denv%1 is the set of functions %aId%1 → [%denv%1 → %aFn%1]
.END
This statement requires a fixed point interpretation to be meaningful.
Finally stirring denotations for simple variables back into our %denv%*,
we are led to an adequate description of environments for LISP:
.BEGIN CENTERIT;
←%denv%1 is the set of functions %aId%1 → [%denv%1 → %aFn%1∪%aS%1]
.END
**structure of domains**
**justification**
Recalling that this characterization of %denv%* arose in explicating
free variables, it should not come as too much of a surprise that
we must modify the fix-point characterization of %3label[f;g]%*.
We had assumed that %3f%* and %3g%* were in %aFn%*, whereas they are
in %denv%*→%aFn%1. %3label[f;g]%* binds %3f%* to %3g%* in an environment,
leaving free variables in %3g%* unassigned. These free variables of %3g%*
are evaluated in the environment current when the %3label%*-expression
is applied. Thus the meaning of a %3label%*-expression should also be a
member of %denv%*→%aFn%1. So:
.BEGIN TURN OFF "{","}";CENTERIT;FILL;TURN ON "#";
←%9D%4a%1{%3label[%af%1;%ag%*]}%d(l)%1#=
#%dY(%9λ%d(h)(%9λ%d(l%9*%d)(%9D%4a%1{%ag%*}%d(l%9*%1#:#%d<f,h>)) ))(l)%1.
.END
Notice that all our work on fixed-points and recursion equations has only
involved %2single%* equations. Indeed, the label operator can only define
a single function. Frequently a function needs to be defined via a %2set%*
of recursion equations. In particular when we wrote %3eval%* and %3apply%*
we were really asking for the solution to such a set of equations: %2⊗→mutual recursion↔← equations%*.
When we wrote:
.BEGIN TABIT1(30);SELECT 3;GROUP;
\eval <= λ[[e;a] ...
\apply <= λ[[fn;x;a] ...
.END
As we said in {yonss(P38)}, we really meant:
.BEGIN TABIT1(30);SELECT 3;GROUP;
\label[eval; λ[[e;a] ... ]
\label[apply; λ[[fn;x;a] ...]
%1 where %3eval%* and %3apply%* occur within the %3λ%*-expressions.
.END
That is:
.BEGIN TABIT1(30);SELECT 3;GROUP;
\label[eval; %9F%1(%3eval,apply%1)]
\%3label[apply; %9Q%1(%3eval,apply%1)]
.END
Now since LISP doesn't allow %3label%* to express mutual recursion directly,
we must resort to a subterfuge if we wish to express such constructions in LISP.
Namely:
.BEGIN CENTER;SELECT 3;
label[eval; %9F%1(%3eval,label[apply; %9Q%1(%3eval,apply%1)]%1)]
.END
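For instance, an ordinary pair of mutually recursive functions can be nested the
same way. Here is a rough sketch in a modern Lisp; %3my-even%* and %3my-odd%* are
names we have made up, and the local definition plays the role of the inner %3label%*:
.BEGIN NOFILL;SELECT 3;GROUP;
;; the outer definition corresponds to the outer label; the inner, local
;; definition of my-odd refers freely to the outer name my-even
(defun my-even (n)
  (labels ((my-odd (m) (if (zerop m) nil (my-even (- m 1)))))
    (if (zerop n) t (my-odd (- n 1)))))

;; (my-even 10) is T; (my-even 7) is NIL
.END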
Clearly this subterfuge is applicable to the definition of other
(mutually recursive) functions; but subterfuge, it is. To really be
useful, we realize that LISP must at least allow a %2sequence%* of
definitions to be given such that these definitions will be in effect
through some reasonable calculation.
There is minimal insight gained by thinking
of our LISP functions as anonymous solutions to a gigantic fixpoint equation.
Sequencing is thus important. Sequencing is implicit in the order
of evaluation of arguments expressed in %3eval%*; sequencing is
implicit in the evaluation of ⊗→special form↔←s.
Certainly sequencing has become quite explicit by the time we examine %3prog%*.
All of these manifestations of sequencing have been abstracted out in the
denotational approach.
It is natural to ask whether there is some way to introduce
sequencing into our formalism so that more realistic implementations
can be proved correct.
***continuations: extensions to sequencing and order of evaluation***
After all this work there really should be a comparable return on
our investment. We believe there is. An immediate benefit is a
clearer understanding of the differences between mathematics and programming
languages, and a clearer perception of the roles of definition and computation.
Later we will see that the thought required to specify domains and
ranges of functions is directly applicable to
the design of an extension to LISP which allows user-defined
data structures.
.CENT(Problems);
%2I%* What is the %2statement%* of the correctness of %3eval%* in terms
of the denotational semantics?
.SEC(Running on the machine)
%1
This section is for the brave few who wish to run on a machine.
The %2programming language%*, LISP, is the Sexpr translation
of the LISP functions and %3prog%*s that we have been writing. What are
some of the implications of writing in Sexpr form?
First, LISP becomes the world's largest parenthesis sink. It makes
for difficult reading at times, but there are formatting tricks, and
editing programs which will help the user match parens and locate
parts of the program. (It only hurts for a little while). There is
a practical benefit which greatly outweighs this anguish factor:
since the proper input to LISP programs is Sexprs, and since we are writing
LISP programs in Sexpr form, then on input data and program are
indistinguishable in format; i.e., they are both binary trees.
Obviously for evaluation, programs must have a very special structure,
but program and data are both trees, just as on conventional hardware machines
the contents of locations which contain data are indistinguishable
in format from the contents of locations which contain instructions.
On the bare machine it is the way in which a location is accessed
which determines how the contents of that location are interpreted.
If the central processor accesses the location via the program
counter, the contents are assumed to be an instruction. That same
location can usually be accessed as part of a data manipulating
instruction. In that case, the contents are simply taken to be data.
LISP is one of the few high-level languages which also has this property.
It gives the programmer exceptional power. Let's complete the
analogy. The central processor here is our old friend, ⊗→%3eval%*↔←.
If %3eval%* references a binary tree via its `program counter', then
that tree is decoded via the internals of %3eval%*. If a tree is
referenced as input to an Sexpr-massaging function, then it is
taken as data. As a result, a LISP program can %3cons%*-up a new
Sexpr which can then be evaluated (i.e., interpreted as an instruction).
The simplest way to communicate with such a machine is to read the
Sexpr translation of a LISP expression into memory, evaluate the expression, and
print the result. Several implementations of LISP use a slight variant
of this called the "read-eval-print" loop:
←%3prog[[]##a##print[eval[read[];NIL]];go[a]]%*.
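In a modern Lisp the same loop might be sketched as follows, using the host
system's own %3read%*, %3eval%*, and %3print%*; a real implementation would add
prompting and error handling:
.BEGIN NOFILL;SELECT 3;GROUP;
;; a minimal read-eval-print loop
(defun repl ()
  (loop (print (eval (read)))))
.END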
.BEGIN TURN ON "#";
Since programming is done using the sexpr translation of LISP functions
it becomes convenient to sometimes say "...#the function %3CAR%*#..." or
"...#write a %3PROG%*#...",#... when we actually mean %3car%* and %3prog%*,#....
The actual LISP functions are written in the language generated by syntax equations
which we have been including. This language is called the meta-language, and
the LISP expressions are called ⊗→M-expressions↔← or M-exprs.
The distinction between Mexprs and their Sexpr translations must not be forgotten.
.END
Though %3eval%* is the equivalent of the basic CPU of the ⊗→LISP machine↔←,
there is another artifact in many versions of LISP to communicate
with the outside world. As with most pieces of I/O equipment,
it leaves something to be desired. Its name is %3evalquote%*.
.SS(%3evalquote%*,%3evalquote%*)
The basic structure of the %3evalquote%*-loop is:
%21.%* read in a (Sexpr representation of a) function, i.e., a name
or a lambda expression.
%22.%* read in a list of (evaluated) arguments to be applied to the function.
%23.%* apply the function (of step %21.%*) to the argument list.
%24.%* print the value.
%25.%* go to step %21.%*
Thus the structure of the loop is essentially:
.BEGIN TABIT2(24,27);TURN OFF "←";
%3
.GROUP
prog[[fn;x;z]
\l\fn ← read[ ];
\\x ← read[ ];
\\z ← evalquote[fn;x];
\\print[z];
\\go[l]]]
%1where: %3evalquote <= λ[[fn;x]apply[fn;x;NIL]]
.APART
%*
or more concisely:
%3
.GROUP
\prog[[ ]
\ l print[evalquote[read[ ];read[ ]]];
\ go[l]]
.APART
%*
.END
.GROUP
where:
%21.%* the function, ⊗→%3read%*↔←, negotiates with an input device to read an
Sexpression.
%22.%* the function, ⊗→%3print%*↔←, prints the value of its argument on an output
device.
.APART
The details of %3read%* and %3print%* will appear when we discuss implementation
of LISP.
.BEGIN TURN ON "#";
Here's an example of the behavior of %3evalquote%*. Assume that the input
device contains %2CAR#((A#B))%*. Then the first %3read%* in %3evalquote%* gets
%2CAR%* (a string representing an Sexpr), and the second read gets %2((A#B))%*;
then %3apply%* gets called as:
%3
←apply[CAR;((A B));NIL].
%*
%3apply%* says that this is the same as evaluating
%3car[(A#B)]%*, and thus returns %3A%* as its answer, which is dutifully
printed by %3print%*.
.END
So %3evalquote%* can be used as a `desk calculator' for LISP.
If we wish to evaluate an expression %3f[a%41%*;#...#a%4n%*]%1 for
%3a%4i%1 constants, i.e. sexprs, then %3evalquote[F;#(a%41%*#...#a%4n%*)]%1
will do it for us. But the %3evalquote%*-loop will %2not%* evaluate arguments;
the a%4i%1's in the call on %3evalquote%* are not expressions, not translates
of constants, but are constants.
Typing %2CONS#((CAR#(QUOTE#(A)))#(QUOTE#(B)))%* to the %3evalquote%*-loop
will result in %3((CAR#(QUOTE#(A)))#.#(QUOTE#(B)))%*.
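A rough analogue in a modern Lisp makes the distinction visible; %3evalquote%* here
is our own two-line sketch, not a system function, and the arguments handed to it
are taken as already-evaluated constants:
.BEGIN NOFILL;SELECT 3;GROUP;
;; apply the (unevaluated) function name directly to the constant arguments
(defun evalquote (fn args) (apply fn args))

;; (evalquote 'car '((a b)))  is A
;; (evalquote 'cons '((car (quote (a))) (quote (b))))
;;   is ((CAR (QUOTE (A))) QUOTE (B)), the same sexpr written above in
;;   dot notation: the arguments were not evaluated
.END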
The %3evalquote%*-loop is an anomaly. It does not expect the usual representation
of a LISP form say %3(F e%41%* ... e%4n%*)%1 where the %3e%4i%1's
are to be evaluated. It expects a %2pair%* of sexprs representing a function
and %2evaluated%* arguments. The benefits of having "functional-notation" as
input and having arguments implicitly evaluated are greatly outweighed
by the confusion introduced by having two formats for LISP expressions, one for
the "top-level" and a different one everywhere else.
Certainly
before we can do any reasonable amount of calculation, we must be
able to define and name our own functions. The %3label%* operator, or
calling %3eval%* with an initial symbol table containing
our definitions will suffice, but this is not particularly elegant.
We would like our definitions to have some permanence.
A function (actually a %2special form%*, if you have been paying attention)
named ⊗→%3define%*↔←, whose effect is to add definitions to %3apply%*, has been
provided. The actual behavior of %3define%* will appear in the sections on
implementation.
.P51:
%3define%* is the name of a special form (its arguments are not evaluated)
of one argument, say %3x%1. %3x%* is a list of pairs: %3((u%41%* v%41%*) ... (u%4n%* v%4n%*)).%1
Each %3u%4i%1 is the name of a function and each %3v%4i%1 is the λ-expression
definition. Actually each %3u%4i%1 and %3v%4i%1 is the %2Sexpr translation%* of the
name and the function, but you knew that anyway. The effect of %3define%*
is to put the appropriate definitions in the symbol table.
For example we might wish to define the function which reverses a list as:
%3
.BEGIN NOFILL;TURN OFF "←";
.GROUP
reverse <= λ[[x] prog[[y;z]
y ← x;
z ← NIL;
a [null[y] → return [z]];
z ← cons[car[y];z];
y ← cdr[y];
go[a];]]
%1
.APART
Then the following would make this definition grist for the %3evalquote%* mill.
.GROUP
%3
DEFINE((
(REVERSE (LAMBDA (X)
(PROG (Y Z)
(SETQ Y X)
(SETQ Z NIL)
A(COND ((NULL Y)(RETURN Z)))
(SETQ Z (CONS (CAR Y) Z))
(SETQ Y (CDR Y))
(GO A) )))
))
%1
.APART
and subsequently:
%3REVERSE ((A B C)) %1would evaluate to: %3(C B A).%1
.END
%1
C. Weissman's LISP PRIMER is an excellent source of machine examples
and should be consulted now. Always remember that you are writing the sexpr representation
of LISP functions and expressions. Do not confuse the two.
.SS(A project: extensions to %3eval%*,%3eval%*,P67:)
.SELECT 1;
***h.s. about how wonderful LISP is for extenson***
.BEGIN TURN ON "#";
Consider a p%4i%* → e%4i%* component of a conditional expression. As
conditionals are now defined, e%4i%* must be a single expression to be evaluated.
In subsets of LISP without side-effects this is sufficient. It is, however,
convenient in practice, i.e., in LISP %2with%* side effects, to extend conditionals to include components of the
form: p%4i%*#→#e%4i1%*; ... e%4in%*.
This extended component is to be evaluated as follows: if p%4i%* is true, then
evaluate the e%4ij%*'s from left to right, with the value of the component to be
the value of e%4in%*.
For example, this feature, used in %3prog%*s would allow us to replace:
.BEGIN TABIT2(10,14);
\\ ....
\\[p%41%* → %3go[l]%1]
\\ ...
\%3l\%1e%41%*;
\\e%42%*;
\\ ...
\\%3go[m];
.END
with: [p%41%* → e%41%*;e%42%*; ... %3go[m]%*;]. This is certainly easier to read.
This extended conditional expression is available on versions of LISP 1.6 on the PDP-10.
Here is another cosmetic. Instead of requiring that the body of a λ-definition
be a single expression: %3λ[[#...#]f[#...#]]%*, allow bodies of the form:
%3λ[[#...#]f%41%*[#...#];#...#f%4n%*[#...#]]%1. The evaluation of such
should mean: bind as usual, evaluate the %3f%4i%1's from left to right, returning
as value %3f%4n%*[#...#]%1.
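Both extensions reduce to the same helper inside %3eval%*: evaluate a list of forms
from left to right and return the value of the last one. A sketch in a modern Lisp,
where %3eval-in-env%* stands for whatever evaluator the interpreter uses (it is a
hypothetical name, not a real system function):
.BEGIN NOFILL;SELECT 3;GROUP;
;; evaluate each form in turn, for effect, returning the value of the last;
;; eval-in-env is assumed to be the interpreter's own evaluator
(defun evprogn (forms a)
  (cond ((null (cdr forms)) (eval-in-env (car forms) a))
        (t (eval-in-env (car forms) a)
           (evprogn (cdr forms) a))))
.END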
This next extension to %3eval%* was derived from the syntax of ⊗→MUDDLE↔←, ⊗→CONNIVER↔←, and
⊗→MICRO-PLANNER↔←.
We have seen that LISP calling sequences are of two varieties: functions calls,
evaluating %2all%* of the arguments; and special forms, evaluating %2none%* of the
arguments.
In an attempt to generalize this regime we might allow the evaluation of some
of the arguments and enlarge on the domain
of objects which can appear in the list of λ-variables.
We might partition the formal parameters into obligatory parameters, optional
parameters, and an excess collector.
Obligatory parameters %2must%* have corresponding actual parameters; optional
actual parameters are used if present, otherwise declared default
values are used; and
if there are more actual parameters than the formals encompassed by the first
two classes, then they are associated with the excess collector.
To be more precise consider the following possible BNF equations:
.BEGIN TABIT1(13);TURN OFF "←";
<varlist>\::=[<obligatory> <optional> <excess>]
<obligatory>\::= <par>; ...<par> | %cε%*
<optional>\::= "optional" <opn>; ... <opn> | %cε%*
<excess>\::= "excess" <par> | %cε%*
<par>\::= <variable> | @<variable>
<opn>\::= <par> | <par> ← <form>
.END
The associated semantics follows (a rough binder realizing these rules is sketched after the examples below):
.BEGIN INDENT 0,10;TURN OFF "←";
%21.%* The formal parameters are to be bound to the actual parameters from
left to right (as usual).
%22.%* There must be an actual parameter for each obligatory parameter, and
if there is no excess collector there may not be more actual parameters than
formals. (There may be fewer if we have optionals.)
%23.%* If a <variable> in a formal parameter is preceded by a "@", then
the corresponding actual parameter is %2not%* evaluated.
%24.%* We might run out of actual parameters while binding the optionals.
If we do, then we look at the remaining formal optionals.
If a formal is simply a <par> then bind it to %3NIL%*; if a formal is
@<variable> ← <form> then bind the <variable> to the <form>; or if the formal is
<variable> ← <form>, bind <variable> to the value of <form>.
%25.%* Finally, the excess collector is bound to a list of any remaining
actual parameters in the obvious way: if <par> is <variable> then form a list
of the values of the remainder; if it is @<variable>, bind <variable> to the
actual list. If there is no excess, bind to %3NIL%*.
.END
.BEGIN TURN OFF "←";
These same languages have extended %3prog%*-variables slightly, allowing
them to be initialized explicitly. If a %3prog%*-variable is atomic,
initialize it to %3NIL%*, as usual. If it is of the form <variable>#←#<form>
then initialize it to the value of the <form>.
.END
.END
.BEGIN TURN OFF "←";turn on "\"; tabs 10,20;
Here are some examples:
1. In the initialization of %3length%* on {yon(P45)}, we could write:
%3...#prog[[l#←#x;#c#←#0]#....%*
2. %3list%* could now be defined as: %3λ[[%1"excess"%* x]x]%*.
3. Consider the following definition:
.BEGIN NOFILL;
.group
\%3baz <= λ[\[x;%1@%*y;%1"optional"%* z; u ← 0; %1"excess"%* v]
\\print[x];
\\print[y];
\\print[z];
\\print[u];
\\print[v]].
.APART
%1Then a call of:
%3eval[(BAZ 2 (CAR (QUOTE (X Y))) 4 5 6 7 (CAR (QUOTE (A . B))));NIL]%* would print:
%3
2
(CAR (QUOTE (X Y)))
4
5
(6 7 A)%*
(and return value: %3(6 7 A)%*).
Similarly, defining:
.GROUP
\%3fii <= λ[\[x;y;%1"optional"%* z; u ← 0; %1"excess"%3 v]
\\print[x];
\\print[y];
\\print[z];
\\print[u];
\\print[v]].
%1and calling:
%3eval[(FII 2 (CAR (QUOTE (X Y))));NIL]%* prints:
%3
2
X
NIL
0
NIL.%*
.END
.END
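As a very rough sketch of the binding rules enumerated above, here is a binder in
a modern Lisp under simplifying assumptions: the "@" (unevaluated) feature is
ignored, actual parameters arrive already evaluated, optional formals are written
as %3(var default)%* pairs with the default already a value, and the atoms %3OPTIONAL%*
and %3EXCESS%* separate the three classes. The result is an association list of
variable-value pairs; all the function names are ours.
.BEGIN NOFILL;SELECT 3;GROUP;
;; pair obligatory formals with actuals, then hand off to the optionals
(defun bind-args (formals actuals)
  (cond ((null formals)
         (if actuals (error "too many actual parameters") nil))
        ((eq (car formals) 'optional) (bind-opt (cdr formals) actuals))
        ((eq (car formals) 'excess) (list (cons (cadr formals) actuals)))
        ((null actuals) (error "missing obligatory parameter"))
        (t (cons (cons (car formals) (car actuals))
                 (bind-args (cdr formals) (cdr actuals))))))

;; optionals take their actual if present, otherwise their declared default
(defun bind-opt (formals actuals)
  (cond ((null formals)
         (if actuals (error "too many actual parameters") nil))
        ((eq (car formals) 'excess) (list (cons (cadr formals) actuals)))
        ((null actuals) (cons (cons (caar formals) (cadar formals))
                              (bind-opt (cdr formals) nil)))
        (t (cons (cons (caar formals) (car actuals))
                 (bind-opt (cdr formals) (cdr actuals))))))

;; (bind-args '(x y optional (z nil) (u 0) excess v) '(2 x 4 5 6 7 a))
;;   gives ((X . 2) (Y . X) (Z . 4) (U . 5) (V 6 7 A))
.END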
←%2Problems%*
Design simple sexpr representations of these proposed constructs.
Make these extensions to %3eval%*. How useful are these syntactic
sugarings?
.SS(A project: Syntax-directed processes,syntax-directed)
%1
.REQUIRE "NOTES.SDP" SOURCE_FILE;
.SEC(Toward Implementation,symbol tables)
.TURN ON "←","α";
%1
*******diatribe about importance of LISP for implementors****
There are some rather gross defects in our symbol table mechanism.
First, (disregarding %3define%*) we have had to resort to subterfuge
to put the definitions of our functions in our symbol table.
Using %3label%* ({yonss(P38)}), or calling %3eval%* with an initial
symbol table, only defines the function for %2that%* particular call
on %3eval%*. When we finish the computation, the definition disappears.
From a practical point of view definitions should have some permanence.
An implemented version of LISP should know a
lot more functions than the five primitives. It should have a
reasonable class of arithmetic functions, and some commonly used
Sexpr functions (like %3length, assoc, subst, equal,%* etc.) all predefined.
If we do things right we should be able to use the mechanism for
predefining functions as the tool for adding our own definitions.
Second, the table look-up mechanism of %3assoc%*, though theoretically
sound, leaves something to be desired as far as efficiency is
concerned. Since we do wish to implement LISP, we should be concerned with
efficient operations. We will see that an efficient and powerful symbol table structure
can be described quite easily in LISP.
Third, we have already seen that special forms %3QUOTE%* and %3COND%* are
not evaluated in call-by-value style.
Currently the recognizers for special forms are encoded in %3eval%*, and
when an instance of such a form is seen the argument is passed %2unevaluated%*
to a function to perform the required operations.
It would be nice to extend this flexibility
to the user. We would like to have notation for adding λ-definitions of
special forms to our symbol table.
%3eval%* would then need a way of distinguishing a
λ-expression defining a function from a λ-expression defining a special form.
Also there are other calling sequences which are interesting and useful and
would be nice to have.
The current version of %3eval%* only handles function definitions. Thus
the current symbol table need only store the λ-definition and does not
need explicit information saying it is a %2function%* definition.
Fourth, conceptually we could use a more powerful mechanism. Consider
%3car[car]%* as an expression to be evaluated. Perhaps it is best just
to say the expression is ill-formed, but if you knew that %3car%* also
was currently being used as a simple variable besides being a function,
then your intuition says the first occurrence of %3car%* is a function
name and the second occurrence is the simple variable name; and if the
current value of %3car%* was say, %3(A B)%* then %3car[car]%* should evaluate to %3A%*.
That is, context tells us which occurrence of %3car%* is a function
and which is a simple variable ⊗↓just as an evaluator for %3prog%* can
tell a reference to the label %3x%* from the %3prog%*-variable %3x%* in
%3#...#prog[[x;y]#..#x#f[..]#...#g[x#..#]#...#go[x].%* See {yon(P83)}.←.
The current %3eval%* %2will%* operate correctly on this example since
a recognizer for the function %3car%* is explicitly encoded in %3apply%*.
⊗↓For the same reason, the LISP obscenity %3λ[[lambda] ... ]%* will work.
Notice its sexpr representation is %3(LAMBDA#(LAMBDA)# ...#)%*←.
However, in general our current symbol table mechanism is not sufficient for
this interpretation. Forced to use %3assoc%*,
we would only find the %2first%* binding of a variable. It will either
be a λ-expression (function) or a simple value. Context will determine
how %3eval%* will interpret what is found.
Clearly what is needed is the ability to
associate more than one property with an atom. In the above example
%3car%* has (at least) two %2properties%* or %2attributes%*. It has a function
value and it has a simple value. Since %3car%* is a primitive function,
its function value is the location of the machine code to carry out
the %3car%* function; the simple value is the sexpr currently bound to
the variable, %3car%*.
.TURN OFF "α";
These last three desires, for permanence, generality, and for multiple properties,
can be handled by the same mechanism. Clearly we could continue using an
%3assoc%*-like table, though more complicated, storing information
about what kind of value each dotted pair is. However our desire for
efficiency would certainly not be satisfied. Rather, we will
design a new symbol table which can store with each atom sequences of values and
their types. This gives us multiple properties.
We will make the table %2global%*
to %3eval%* and %3apply%*; these functions will now implicitly refer to that table
and thus need not have an explicit symbol-table argument. This
gives us permanence.
We will designate an indicator for function definition and an indicator
for special form definition and expand %3eval%* and %3apply%* to recognize
these indicators. This gives us generality.
Most implementations of LISP allow this association of
arbitrary sequences of %6attribute-value%* pairs with an atom.
It is a very good idea.
How can we associate an arbitrary collection of objects with
something? With each atom we will
associate a list called the %6⊗→property-list↔←%* or p%6-list.%* But atoms
can't be represented as just any other list. Among other things,
we must be able to recognize the occurrence of atoms so that we can
implement the predicate, %3atom%*. So atoms are special lists: the
%3car%* (or first element) of the representation of each atom is an
indicator used exclusively for the beginning of atoms.
We will use %7α%* to designate the beginning of an atom.
The %3cdr%* of the representation of the atom is the property list.
The elements of the p-list are associated in pairs. The first element is
the attribute (or property or indicator); the next element is the value.
For example, here
is part of the atom-structure for %3car%* assuming %3car%* is also being
used as a simple variable and has current value, %3(A B)%*:
.group
.GROUP SKIP 6;
.P50:
←%2Atom-structure for %3CAR%1.
.apart
The indicator, ⊗→%3SUBR%*↔←, tells us that %3car%* is a machine
language function. The indicator, ⊗→%3VALUE%*↔←, tells us that %3car%* is
also being used as a simple variable.
The reason for not pointing directly at %3(A B)%* is described
in {yonss(P5)}.
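Modelled with ordinary lists, a p-list is just an alternating list of indicators
and values, and looking up a property is a simple search. A sketch in a modern Lisp;
%3getprop%* is our own illustrative name, not the system %3get%* described later:
.BEGIN NOFILL;SELECT 3;GROUP;
;; search the alternating indicator/value list for the given indicator
(defun getprop (plist indicator)
  (cond ((null plist) nil)
        ((eq (car plist) indicator) (cadr plist))
        (t (getprop (cddr plist) indicator))))

;; car-code below stands in for the machine-code address stored under SUBR
;; (getprop '(subr car-code value (a b)) 'value)  is (A B)
.END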
It should now be clear how
to program the primitive predicate, %3atom%*: "is %3car%* of the argument
to ⊗→%3atom%*↔← the special atom-indicator?" If it is then %3atom%* gives
value %3T%*, otherwise %3atom%* gives value %3NIL%*.
.P29:
How about the representation of non-primitive functions?
Non-primitive functions are defined using λ-expressions. What we do is
define a new indicator, ⊗→%3EXPR%*↔←, which tells
the evaluator that the function is a λ-expression. For example,
here is part of the atom-structure for %3FACT%*, the Sexpr form
of the factorial function.
.<<atom struct for fact>>
.GROUP
.GROUP SKIP 15;
←%2Atom-structure for %3FACT%1.
.APART
This picture should tell you at least two things: one, that programs
are (almost) stored as binary trees, and two, what the function
⊗→%3define%*↔← does. %3define%* ({yon(P51)}) puts the λ-definition under the indicator %3EXPR%*
on each of the named atoms (functions).
The world becomes much simpler if we store each atom uniquely.
This is the reason for saying "almost" above. That
is, every reference to the atom %3A%* is actually a pointer to the same
"binary tree", the binary tree whose %3car%*-part is the special atom indicator,
and whose %3cdr%*-part is the ⊗→p-list↔← for the atom %3A%*. Thus %3(A . A)%*, which
.GROUP
we have been representing as:
.BEGIN NOFILL;TURN ON "\";TABS 30;
\%7[~~~][~~~]
\%3 A A
%1
is more realistically represented as:
\%7[~~~][~~~]
\%1 to atom-structure for %3A%1.
.END
.APART
.P48:
So the internal structures of this implementation of LISP are not binary trees;
there are intersecting branches. Structures of this sort
we will call %2⊗→list structure↔←%*.
LISP thus deals generally with binary list structure.
Trees are a subset of list structures.
We will see in a moment that many of the
computations which we have been performing have also been generating
list structure.
Question: Assume we have the above dotted pair as a value of a variable %3x%*
and we wish to print the value of %3x%*.
Clearly we would hope to see "%3(A . A)%*" appear on the output device.
We would expect that the print routine, named %3print%*, would be given
a pointer to the dotted pair. %3print%* can recognize that it %2is%* a dotted
pair since %3car%* of it is not %7α%*.
But how can %3print%* distinguish %3(A . A)%* from %3(B . B)%*, for example?
The pointer in the preceeding diagram will point to %3A%* in one case
and to %3B%* in the other but nothing in our representation tells us %2what%*
to print. The simplest thing to do is to store in
each atom-structure some information telling what the name of the atom is.
This is done with another indicator called ⊗→%3PNAME%*↔←, standing for
%2⊗→print-name↔←%*. Thus the atom %3BAZ%* will have at least the following:
.GROUP
.GROUP SKIP 6;
←%2Atom-structure for %3BAZ%1.
.APART
.TURN OFF "#";
The indicator is called %3PNAME%* because the print-name (or ⊗→p-name↔←)
of the atom is what the LISP output routine will print as the name of
the atom. %2BAZ##%* means a memory location
containing some encoding of the letters %2B%*, %2A%*, and %2Z%*.
The symbol, %2#%*, represents some non-printing character; thus we
are assuming a location can contain five characters.
We represent the print-name as a list so that we may allow atoms with
p-names of unbounded length. Thus the p-name for %3BLETCH%*,
would be:
.TURN ON "#";
.GROUP
.GROUP SKIP 6;
←%2P-name structure for %3BLETCH%1.
.APART
How do we get atoms stored uniquely? ⊗→%3read%*↔← does it. On reading
an atom name (which is the way almost all atoms come into existence),
the %3read%* program checks the current symbol table. If the atom does
not appear we add a new entry consisting of the p-list with a single
attribute-value pair ----#the#%3PNAME%*,#p-name#pair#---. If the atom %2does%*
appear, we bring back a pointer to the appropriate entry.
Before talking more about %3read, print%* and the other input-output functions in LISP,
we should discuss the storage of list structure. The implementation
described closely follows that for the PDP-10, though most of
the description is machine independent.
First, though Sexprs and now atoms are represented as binary list structure, there will
be quantities which we wish to represent in memory which are not
structures. For example, numeric constants, and the actual encoding of
print-names for atoms are not structures. These quantities will be stored
in an area called %2⊗→full word space↔← (⊗→FWS↔←)%*. Most of the business of
LISP is massaging of list structure and binary trees, however; such nodes, which have a
%3car%*-part and a %3cdr%*-part are stored in a separate area called
%2⊗→free-space↔← (⊗→FS↔←)%*.
The name, free space, will become apparent in a moment. Each machine
word in FS is divided into two parts; the left half contains a pointer to
the %3car%*-part and the right half contains a pointer to the %3cdr%*-part.
The pointers may point into FS or FWS. In more detail, consider this
representation of
.GROUP
←%3(A .(B . C)):%*
.GROUP SKIP 10;
←%2A representation of %3(A .(B . C))%1.
.APART
Obviously the actual addresses are irrelevant. If we replace the addresses with pointers
to the appropriate cells the resulting diagram will contain the same information.
The origin of the LISP "box notation" should be obvious.
We can now describe an implementation of %3eq%*.
Since atoms are stored uniquely, the predicate, ⊗→%3eq%*↔←,
need only check that its arguments are both atomic and both point
to the same binary tree or location in memory. In most implementations
the check for atom-ness is suppressed and a simple address comparison
is made. Non-atomic sexpressions are not usually stored uniquely. Thus
←%3eq[(A . B);(A . B)]%* is usually false, but
←%3eq[x;x] %*is usually true for any %3x%*.
Please note that we are %2not%* changing the definition of %3eq%*.
%3eq%* is still ⊗→undefined↔← for non-atomic arguments.
The preceding remarks deal only with the usual implementation of %3eq%*.
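The usual implementation behavior can be seen directly in a modern Lisp:
.BEGIN NOFILL;SELECT 3;GROUP;
;; atoms are stored uniquely; composed structure generally is not
(eq 'a 'a)                        ; T
(eq (cons 'a 'b) (cons 'a 'b))    ; NIL: two distinct cells
(let ((x (cons 'a 'b))) (eq x x)) ; T: the very same cell
.END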
The implementation of %3car%* and %3cdr%* presents no problem.
Simple pointer manipulation will suffice.
.GROUP
.GROUP SKIP 6;
←%2The effect of %3car%1.
.APART
.P28:
Finally, a remark about conditional expressions. Most versions of LISP
relax the behavior of the predicates, p%4i%*, appearing in conditional expressions
such that instead of testing for the values %3T%* and %3NIL%* we
test only for %3NIL%* and non-%3NIL%*. That is, anything other than %3NIL%*
satisfies the p%4i%*. This allows such coding tricks as:
.BEGIN SELECT 3;TABIT1(30);TURN OFF "←";
\[x ← cdr[x] → ...]
%1which is equivalent to:%*
\x ← cdr[x];
\[not[null[x]] → ...]
.SELECT 1;
(Recall that the value of "←" is the value of the RHS.)
.END
As with most coding tricks, they should be avoided like the plague.
Again, please note that this discussion is not a change to our definition
of conditional expression, but an implementation dependent feature.
.<<pic of NIL>>
.SKIP TO COLUMN 1;
.SS(A picture of the atom %3NIL%*,%3NIL%*)
We have been writing the atom %3NIL%* as:
.GROUP SKIP 12;
Where the atoms for %3PNAME%* and %3VALUE%* are represented as:
.GROUP SKIP 12;
More realistically we should represent %3NIL%* as:
.NEXT PAGE;
.SS(A first look at %3cons%1)
%1
The effect of the ⊗→%3cons%*↔← function is quite different from that
of the other primitive LISP functions. The other functions (or
predicates) manipulate existing Sexpressions, whereas %3cons%*
must construct a new sexpression from two existing sexprs.
That is, given representations of two sexprs, say %3x%* and %3y, cons[x;y]%*
is to get a new cell, put the representation of %3x%* in the %3car%*-part of
the cell and the representation of %3y%* in the %3cdr%*-part:
.BEGIN TURN ON "\";NOFILL;TABS 10,30,45;
.GROUP
result of %3cons[x;y]%*
\\%7[~~~~~][~~~~~]
\ %1rep. of %3x\\ %1rep. of %3y%1
.APART
.END
Where is %3cons%* to obtain these new cells? They come from the ⊗→free
storage area↔←. Within the free storage (FS) area reside all the cells
which have a %3car%*- and %3cdr%*- part. The cells which contain the actual
p-names we know reside in the full word space (FWS). At any point
in a LISP computation there are cells in the FS area which do not contain
any parts of the user's computation. In particular, when the computation
is begun, only the atom structure for the initial LISP symbol table
uses cells in the FS area. The remaining FS cells form a %2free-space list.%*
Whenever a %3cons%* needs a cell the first cell in the FS list is used and
the FS list is set to the %3cdr%* of the FS list.
As the computation continues, more and more cells are taken from the FS list.
When a %3cons%* operation asks for a cell and the FS list is empty, the
computation is suspended and the %2⊗→storage reclaimer↔←%* or %2⊗→garbage collector↔←%* is
called.
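A toy model of this bookkeeping can be written in a modern Lisp. Here cells are
two-element arrays, the variable %3*fs-list*%* holds the free cells (a real
implementation would chain them through their %3cdr%* halves), and %3my-cons%*
takes the first free cell; all of the names are ours.
.BEGIN NOFILL;SELECT 3;GROUP;
(defvar *fs-list* nil)          ; the free-space list of unused cells

(defun init-fs (n)              ; manufacture n free cells
  (dotimes (i n) (push (make-array 2) *fs-list*)))

(defun my-cons (x y)
  (when (null *fs-list*)
    (error "free-space list exhausted: time to collect garbage"))
  (let ((cell (pop *fs-list*)))
    (setf (aref cell 0) x       ; the car-part
          (aref cell 1) y)      ; the cdr-part
    cell))
.END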
.SS(Garbage collection,garbage collection)
During the course of a computation, contents of cells which were taken
from the FS list often become unnecessary. For example during the evaluation
of something as simple as:
←%3(CONS (QUOTE A)(QUOTE B)),%* many cells are used:
%21.%* At least seven cells are needed just to read in the expression.
If some of the atoms are not present in the symbol table,
more cells will be needed.
%22.%* One cell will be needed to perform the %3cons%* operation.
After the computation is completed, none of the eight mentioned cells are needed.
They are garbage.
Why not simply return these `garbage cells' explicitly to the FS list?
Frequently it is very difficult to know just exactly which cells
%6are%* garbage. Indeed, experiments have been performed in which LISP
programmers were allowed to return `garbage' to the FS list themselves.
The results were disastrous; list structure thought to be garbage
was returned to the FS list by the programmers, unaware that it was still being
used by other computations. We will see where these difficulties arise
later.
Thus reclamation is passed to the LISP system. The %3cons%* function and
its FW space counterpart, %3fwcons%*, are the only functions allowed
to manipulate the free storage lists. When either list becomes empty, the
garbage collector is called.
%2←The fundamental assumption behind garbage collection is:%*
.NARROW 10,10;
At any point in a LISP computation, all cells which contain
parts of the computation are reachable (by %3car-cdr%* chains)
through a fixed set of known cells or registers.
.WIDEN;
The first phase of the garbage collector, called the %2⊗→marking phase↔←%*,
marks all of the list structure which is currently active (reachable
through the above mentioned fixed cells). This should include all the
atoms in the symbol table and all the cells reachable by %3car-cdr%* chains
from any of these atoms. Any partial computations which have been
generated must also be marked. Once all the active structure has been
marked, we proceed to the second phase, the %2⊗→sweep phase↔←.%*
Once the marking has been completed, we go linearly (sweep) through memory,
collecting all those cells which have not been marked. Again,
the assumption is that if cells are not marked in the first phase,
then they do not contain information necessary to the LISP computation.
These unmarked cells are chained together via their %3cdr%*-parts to form
a new FS list. The FS pointer is set to the beginning of this list;
similarly, the unmarked cells in FWS comprise the new FW list.
A more detailed discussion of this garbage collector
will be given when we examine the details of implementation in {yonss(P26)}.
And a more complex algorithm is discussed in {yonss(P27)}.
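In terms of the toy cells sketched earlier, the marking phase is a simple recursive
walk. Here is a sketch in a modern Lisp, keeping the marks in a table rather than
in a bit inside each cell; %3mark%* and %3*marks*%* are illustrative names only.
.BEGIN NOFILL;SELECT 3;GROUP;
(defvar *marks* (make-hash-table :test 'eq))   ; which cells are reachable

;; mark everything reachable by car-cdr chains from this object
(defun mark (x)
  (when (and (arrayp x) (not (gethash x *marks*)))
    (setf (gethash x *marks*) t)
    (mark (aref x 0))
    (mark (aref x 1))))
.END
The sweep phase would then run linearly through all the cells, putting the unmarked
ones back on %3*fs-list*%*.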
Garbage collection appears to be a very complex and time-consuming
process. Indeed it is. As indicated above, there are alternatives in some
applications. If the data-structure manipulations are particularly simple
then leaving storage management in the programmer's hands might
suffice. However, LISP generates very intertwined structures.
Perhaps there is
another way to allow the system to handle storage management.
First notice that storage management becomes quite simple if there is no sharing
of sublists. The question of when to return a list to the free space list
is easily solved. However it is desirable to share substructure quite often.
Instead of using a garbage collector, we might associate a counter, called
a %2⊗→reference counter↔←%*, with each
list when it is built. In that counter we will store the number of references
to that list. Thus the counter will be initialized to 1 when the list is created.
Whenever the list
is shared we increase the counter by 1; whenever a reference to the list
is dropped we decrease the counter by 1.
When the count goes to 0 we can put the cells of the list in the free space list.
One of the most glaring defects in this reference counter scheme is the
prohibition of circular lists. Consider the following sequence:
.BEGIN NOFILL;GROUP;TURN OFF"←";
%21.%* Manufacture a list,%3x%*: %3x ← (B I G M O T H E R L I S T)%1. Reference count is 1.
%22.%* Circularize it: %3x ← circle[x];%*. Reference count is now 2.
%23.%* Delete all references to %3x%*: %3x ← NIL%*. Reference count reverts to 1,
but the list is not referred to, is not on the free space list, and has thus
been lost to the system.
.END
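Here is a sketch of the reference-counter discipline in a modern Lisp; %3share%* and
%3release%* are our own names and must be called at every point where a reference is
created or dropped, which is exactly the bookkeeping that is hard to get right. The
circular list of the sequence above is precisely the case this scheme cannot reclaim.
.BEGIN NOFILL;SELECT 3;GROUP;
(defvar *refcount* (make-hash-table :test 'eq))  ; one count per list

(defun share (l)                 ; a new reference to l has been created
  (incf (gethash l *refcount* 0))
  l)

(defun release (l)               ; a reference to l has been dropped
  (when (zerop (decf (gethash l *refcount* 1)))
    ;; here the cells of l would go back on the free-space list
    (remhash l *refcount*)))
.END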
←%2Problems%*
I This problem deals with what is known in LISP as %2⊗→hash consing↔←%*.
We have been storing atoms uniquely, but it should be clear from the behavior
of %3cons%* that non-atomic sexprs are %2not%* stored uniquely.
Certainly storing single copies of any sexpr would save space. For example,
the non-atomic structure of
%3((A . B).(A . B))%* could be represented with two cells rather than three.
Unique storage is not without its difficulties. What problems do you foresee
in implementing such a scheme?
II We said on {yon(P48)} that many LISP computations generate list structure
rather than true binary trees. Give an example.
.P55:
III Can you write a LISP function %3circle <= λ[[x] ... %* which will take
a list %3x%* and make it circular? Thus:
.BEGIN TABS 8,30,47,65;NOFILL;TURN ON "\";GROUP;
%7
\[~~~~~~~[~~~]\[~~~]~~~]\[~~]~~~]\[~~~~]~~~]
%3\NOTHING\ CAN\ GO\ WRONG
.END
This list is circular on its "last two" elements.
Needless to say printing such structures is not appreciated.
.SS(%3read%* and %3print%*,,P13:)
.TURN ON "←";
When you begin to implement LISP you find that a very significant
part of LISP can be written in LISP. We have already seen that the
evaluation process is expressible this way.
Here we will see that the majority of the I/O routines can be
written as LISP functions calling a very few primitive routines.
The primitive routines can also be described in `pseudo-LISP'.
That is, we will give a LISP-like description of them, though
they would normally be coded in machine language.
The primitive functions are ⊗→%3ratom%*↔← (read an atom) and ⊗→%3prin1%*↔←.
.BEGIN INDENT 0,10;
.GROUP
%3ratom[ ]%* is a function of no arguments. It reads the next atom
or special character (left paren, right paren or dot).
It looks up that object in the symbol table and returns
a pointer to that symbol table entry. If no entry
is found an appropriate entry is made.
%3ratom%* is the LISP ⊗→scanner↔←. %3ratom%* flushes spaces and commas,
using them only for delimiters. It returns only atoms or special characters
to %3read%*, the LISP ⊗→parser↔←.
.APART
.GROUP
%3prin1[x]%* is a function of one argument expecting an atom, left paren,
right paren, blank, or dot as the value of its argument.
It will print the p-name of that object on the output
device.
.APART
.END
To make life simpler we need to be able to refer to atoms whose values are
the characters "%3)%*", "%3(%*", ".", and " "(blank).
We will assume that ⊗→%3rpar%*↔←, ⊗→%3lpar%*↔←,
⊗→%3period%*↔←, and ⊗→%3blank%*↔← are such atoms. For example, if the next input
character is "(" then
←%3eq[ratom[ ];lpar]%* is true (and the input pointer is moved to
the next character!).
%3prin1[period]%* will have the effect of printing a %2"."%* on the output
device.
We will discuss the structure of %3ratom%* and %3prin1%* in a moment. In
the meantime here are %3read%* and %3print%*:
.BEGIN TABIT2(17,20);TURN OFF"←";SELECT 3;
.GROUP
⊗→%3read%*↔← <=λ[[ ]prog[[j]
\j ← ratom[ ];
\[eq[j;lpar] → return[read1[ ]];
\ or[eq[j;rpar];eq[j;period]] → LOSEBIG1;
\ T → return[j]]]]
.APART
.GROUP
⊗→%3read1%*↔← <= λ[[ ]prog[[j;k]
\\j ← ratom[ ];
\\[eq[j;lpar] → return[cons[read1[ ];read1[ ]]];
\\ eq[j;rpar] → return[NIL];
\\ eq[j;period] → go[rd];
\\ T → return[cons[j;read1[ ]]] ];
\rd\k ← ratom[ ];
\\[eq[k;lpar] → k ← read1[ ];
\\ or[eq[k;rpar];eq[k;period]] → LOSEBIG2];
\\j ← ratom[ ];
\\[eq[j;rpar] → return[k];
\\ T → LOSEBIG3] ]]
.APART
.END
Here's %3print%* and friends. %3terpri%* gets a new output line.
(note: the value of %3print[x]%* is %3x%*.)
.BEGIN TABIT2(17,20);TURN OFF "←";SELECT 3;
.GROUP
⊗→%3print%*↔← <= λ[[x]prog[[ ]
\prin0[x];
\terpri[ ];
\return[x]]]
.APART
.GROUP
⊗→%3prin0%*↔← <= λ[[x]prog[[j]
\\[atom[x] → return[prin1[x]]];
\\prin1[lpar];
\\j ← x;
\a3\prin0[car[j]];
\\[null[cdr[j]] → go[a6]];
\\prin1[blank];
\\[atom[cdr[j]] → go [p1]];
\\j ← cdr[j];
\\go[a3];
\p1\prin1[period];
\\prin1[blank];
\\prin1[cdr[j]];
\a6\prin1[rpar];
\\return[x] ]]
.APART
.END
So far we have thrown all the I/O back on %3ratom%* and %3prin1%*. Clearly
%3ratom%* will be more interesting. All %3prin1%* need do is get the p-name and
print it. %3ratom%* needs to search the symbol table (efficiently hopefully),
and if
the atom is not found, add it to the table. All %3ratom%* has to work
with is the actual character string (called %3chr-str%* in the future) which
will be the p-name of some atom. What %3ratom%* could do is look at the
p-name of each atom currently in the symbol table; when it finds a match
it returns a pointer to the beginning of that atom.
(Notice this is essentially the search scheme of %3assoc%*.)
If the appropriate
atom is not found it can build a new one consisting of the p-name, add it
to the table, and return a pointer to this new atom. This would
have the appropriate effect; each atom would be stored uniquely, but
the efficiency would leave something to be desired.
←%2Problems%*
***** non-recursive reader...better worse***
*** slashifying ***
.SS(Hashing,hashing,P14:)
Symbol table lookup is exactly the problem of looking up
words in a dictionary. The above scheme of %3assoc%* (linear search)
is analogous to a person beginning at page one of the dictionary and
proceeding linearly (page-by-page and word-by-word) through
the book until he finds the word in question. Truly a losing
scheme. What we normally do is look at the first character of the
word and go immediately to the subsection of the dictionary which
has the words beginning with that character. We know that if
we cannot find the definition of our word in that subsection we need
look no further in the book. Usually we delimit our search even
further by keying on subsequent characters in the word. Finally
we may resort to linear search to pin down the word on a specific
page or column.
What we want is a similar scheme for the machine. We might in fact
mimic the dictionary search, subdividing our table into 26 subsections.
We can do better.
Since it is the machine which will subdivide and
index into the symbol table, we can pick a scheme which is computationally
convenient for the machine. Besides being convenient, the scheme should
result in rather even distribution of atoms in the subsections.
Obviously if the majority of the atoms end up in the same partition
of the table we will have gained little towards improving the
search efficiency. Now for terminology. Each of the partitions
or subdivisions of the table is called a %2⊗→bucket↔←%*. Each atom will
appear in at most one bucket. The computational scheme or function
used to determine which bucket a particular atom belongs in is called
a %2⊗→hashing function↔←%*. All ⊗→%3ratom%*↔← has to work with is %3chr-str%*, the
encoding of the actual name of the atom. So the hashing function
will use %3chr-str%* to determine which bucket to examine. If the atom
with ⊗→PNAME↔← %3chr-str%* does not appear in that bucket we are assured
that it does not appear anywhere in the table. In this case we
make up the minimal table entry --#atom#indicator, %3PNAME%*, p-name structure#--
and add it to that bucket.
There are lots of hashing functions. Here's one:
%21.%* Assume that we have N+1 buckets, numbered 0, 1, 2 ... N.
%22.%* Take the representation of %3chr-str%* (it's a number) and
divide it by N+1.
%23.%* Look at the remainder. It's a number between 0 and N.
%24.%* Use that remainder as the index to the appropriate bucket.
This is a scheme frequently used in LISP. Given the bucket number,
we then run down the list of atoms in that bucket, comparing
print-names against %3chr-str%*. If the atom is not found, %3cons%*-up
the atom and stick it in the bucket.
There is a lot of mystery and history about hashing and other means
of searching and storing symbols. References are given in the
bibliography.
In review, here is the structure of the LISP input mechanism:
%21.%* %3read%*, the LISP ⊗→parser↔←, builds a list-structure representation
of the input string. %3read%* is defined in terms of %3ratom%*.
%22.%* %3ratom%*, the LISP ⊗→scanner↔←, builds atoms and special characters
from the input. It searches and adds to the LISP symbol table.
%23.%* The LISP symbol table, usually called ⊗→%3OBLIST%*↔← (for %2⊗→object list↔←%*),
is a list of buckets. Each bucket is a list of the atoms which `hash' to that bucket.
We actually represent the object list as an array named %3oblist%*.
This way the hash number will give us the array subscript and we can
go to the right bucket immediately.
We won't have to go "cdr-ing" down the object list to get to the right bucket.
See figure on next page.
Here is a `pseudo-LISP' version of ⊗→%3ratom%*↔←:
.BEGIN TABIT2(17,20);FILL;
%3hash%*\is a function returning the bucket number of its argument.
%3insert%*\is a function to build the atom and insert it into the bucket.
%3right-one%*\is a predicate used to check if an atom
has the right print-name.
.NOFILL
.TURN OFF "←";
.SELECT 3;
⊗→ratom↔← <= λ[[ ]prog[[z;i;bucket]
\\i ← hash[chr-str];
\\bucket ← oblist[i];
\a\[null[bucket] → return[insert[chr-str]]];
\\z ← get[car[bucket];PNAME];
\\[right-one[z;chr-str] → return[car[bucket]]];
\\bucket ← cdr[bucket];
\\go[a]]]
.END
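Here, under the assumption that %3chr-str%* arrives as an ordinary character string,
is a sketch of the three helpers in a modern Lisp. %3sxhash%* stands in for the numeric
encoding of the print-name, %3intern%* stands in for building the atom structure by
hand, and the bucket array %3*oblist*%* is our own name.
.BEGIN NOFILL;SELECT 3;GROUP;
(defvar *oblist* (make-array 128 :initial-element nil))  ; 128 buckets

(defun hash (chr-str)                 ; the remainder gives the bucket number
  (mod (sxhash chr-str) (length *oblist*)))

(defun right-one (entry chr-str)      ; does this atom have that p-name?
  (let ((p (get entry 'pname)))
    (and p (string= p chr-str))))

(defun insert (chr-str)               ; build a new atom and bucket it
  (let ((new (intern chr-str)))
    (setf (get new 'pname) chr-str)
    (push new (aref *oblist* (hash chr-str)))
    new))
.END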
%3get%* is a LISP system function. It is described on {yon(P52)}.
.SKIP TO COLUMN 1;
.SS(A review of the structure of the LISP machine,LISP machine)
.BEGIN TABIT2(10,20);GROUP;
\_______________________________________
\\⊗→%3eval%*↔← and friends
\\⊗→%3read%*↔← and ⊗→%3print%*↔←
\\the ⊗→garbage collector↔←
\\the base registers for marking
\\ ⊗→FS pointer↔←
\\ ⊗→FWS pointer↔←
\\ symbol table(⊗→%3OBLIST%*↔←) pointer
\\ registers for partial results
\_______________________________________
\\⊗→Free space↔←
\\ the free space list
\\ those parts of sexprs containing
\\ %3car%*- and %3cdr%*-parts.
\_______________________________________
\\⊗→Full word space↔←
\\ the full word space list
\\ atom print names
\\ numbers
\_______________________________________
.END
.NEXT PAGE;
.SS(More on symbol tables,symbol tables)
←%2Subtitled: But my Fortran program doesn't do all this crap!%1
It is quite true that a running Fortran, PL/1, or Algol program
is far less complicated as far as its symbol accessing mechanisms
are concerned. In Fortran there is a simple relationship between
variables and memory locations which will contain their values;
in Algol, there is a simple relationship between variables and positions
in the run-time stack.
In LISP, both the quality and the quantity of variables can change.
Arbitrary properties can be associated with atoms at run-time.
Indeed, the symbol table mechanism of LISP is more reminiscent of
that associated with the Fortran or Algol compiler. For these less
sophisticated languages it is the compiler which performs the mapping
from source language to running machine code. It is the compiler's
responsibility to discover the properties associated with each variable.
The compiler can do this because the semantics of the language is such
that at compile time all (or almost all) properties of the variables are known.
This is not true for LISP. In general you cannot tell until run time what
the attributes of a particular atom are. The situation is really even worse
than this. Since programs and data are indistinguishable, we can %3cons%*-up a
list, assign it to a variable, then turn around and use that variable as
a function. Try that in Fortran.
Now the world isn't all this grim. There are LISP compilers (written
in LISP naturally). They can make many decisions at compile time
about the properties of variables. But in general the compiled code
will be interspersed with calls on %3eval%*. That is, %3eval%* will have to make
some decisions at runtime, because the compiler just doesn't know what to do.
This implies that compiled and interpreted code must be able to
communicate with each other. If a piece of compiled code calls a
λ-expression or conversely, the right thing must happen. The execution
of the program should be totally transparent as to whether any, or all, or
none of the functions are compiled. This means that the calling
sequences for both kinds of functions must be compatible. Less
obvious and by far more troublesome, is the communication of global
values between functions.
.SS(Global Variables,global variables)
A variable used globally (or free) in a function (or %3prog%*) is one which
does not appear in the λ-list or in the list of %3prog%*-variables.
For example in:
←%3λ[[x]*[x;y]], y%* is global.
Communication of global variables between interpreted functions
is not problematic: just look down the atom structure for the %3VALUE%*-cell.
Handling of global variables by compiled functions requires some
care. The ⊗→%3VALUE%*-cell↔← mechanism is designed to accomplish this.
Essentially, the %3VALUE%*-cell will %2always%* contain the current value
of its variable. The compiler then needs to be able to find this cell and
generate efficient code to manipulate its contents.
What we will do now is:
%21.%* Introduce a graphical language, based on ⊗→AMBIT/G↔←, to describe
many of the operations of LISP ({yonss(P26)}).
%22.%* Describe parts of the dynamics of evaluation using the ⊗→Contour
Model↔← ({yonss(P31)}).
%23.%* Describe the mechanism for user-defined special forms, and another
defining mechanism called ⊗→macros↔← ({yonss(P18)}).
%24.%* Give a new improved definition of %3eval%* and Co. which uses the
new symbol table mechanism
and evaluates forms, special forms and macros.
Show how the ⊗→bind↔←ing mechanism, called
%2⊗→shallow binding↔←%*, operates ({yonss(P30)}).
%25.%* Introduce a simple stack machine, %aSM%*, mapping the graphical
descriptions onto a ⊗→PDP-10↔←-like machine ({yonss(P33)}).
%26.%* Write a LISP function which will operate as a ⊗→compiler↔←, translating
the Sexpr representations of LISP functions into `equivalent' sequences
of %aSM%* code ({yonss(P32)}).
%27.%* Then we can see how the problem of global variables is handled ({yonss(P5)}).
In parallel to %2this%* development will run a presentation of an
alternative binding strategy called %2⊗→deep binding↔←%*.
We will also present an alternative calling structure
which works well with deep bindings and is not so hardware
oriented as its competitor.
These alternatives are perhaps more intuitive implementations of our original a-list
version of %3eval%* than is the shallow binding scheme. We will contrast their
relative efficiencies.
.SS(AMBIT/G,AMBIT/G,P26:)
.SELECT 1;
AMBIT/G⊗↓An ambit is half animal, half hobbit. AMBIT/G also is an acronym
for Algebraic Manipulation By Identity Transformation/Graphical.←
is a graphical language for the description of both
data and algorithms. We will explore a few aspects of AMBIT,
using it only to describe most of the basic operation of LISP. AMBIT
is a powerful graphical language suitable for describing complicated
data structures and their manipulation. A crucial problem of non-numerical
applications is the structuring of the data. Indeed it is frequently the case
that the structuring of the data is the %2only%* problem; once the
proper representation has been established a very simple program to
manipulate the complex representation will suffice. Often
what appears to be a complicated algorithm is, in reality, a simple algorithm
operating on complex data.
A poor representation will lead to an inefficient program.
Therefore in AMBIT, programming is attempted only after the data design
has been completed.
Since the
interrelationships between data structures are inherently more static than
the execution of an algorithm, it is also beneficial to separate complex data
from simple program ⊗↓Perlis: "suppress the constant and display the variable".←.
Proofs of correctness of programs designed in this way should also be more easily
obtained. There are difficulties in designing programs in this manner; a
significant difficulty lies in the paucity of notation which we have available
for describing algorithms. Current notations for programming languages are derived
from linear mathematical notations. These notations are ill-suited for
describing data structure representations.
As a very simple case, functions in mathematics are single-valued, but often
in programming we would like to return more than one value. In particular we
would like to return values which can be used immediately as arguments to another
function. A generalized composition, if you wish. Mathematical notation
is not conducive to such operations even though the operation is
quite natural and easily understood from the model of execution.
If we do not want to use a high level language then
the other common choice is to encode the
algorithm in machine language. The objections to this approach are clear: the
program is incomprehensible to other people, and all too frequently it is
poorly understood even by its creator.
When we think about writing a complex non-numerical program like a compiler,
a proof-checker, or a piece of a large system, we draw pictures to
help crystallize our ideas and to structure the data efficiently.
When faced with explaining a complex structure-manipulating
program to someone we draw a picture. Clearly then, a graphical notation
for programming is worth exploring. AMBIT is such a language.
First, the data representation in AMBIT is a generalization of the
%2box notation%* of LISP. We have already seen that it is often
helpful to draw pictures to understand the data representation
in a particular problem. Instead of requiring that we draw all
nodes in a tree as boxes, we might convey more of our intuition
by drawing nodes of different shapes to represent nodes which contain
different kinds of information. We have already begun to do
this on a small scale. Words (nodes) in Full Word Space
are drawn:###%7[~~~~~~]%*### whereas nodes in free space are
drawn:###%7[~~~~]~~~~]%*### even though as far as most machines are
concerned both kinds of nodes map into the same kind of memory cell.
AMBIT/G exploits this generalization of shapes.
For example ⊗↓These examples are taken from the formal definition of BASEL.←
in a typed language containing types integer, boolean, structure, and real
we represent these modes as:
.BEGIN GROUP;centerit;
.GROUP SKIP 6;
←%2Example of type shapes
.END
.BEGIN TURN OFF "←";GROUP;
We might represent the result of executing %3x ← 2%*, where %3x%* is of type %3int%*, as:
.GROUP SKIP 10;
.END
If the language also has type definition facilities, then we might represent
mode %3complex%* as:
.BEGIN GROUP;CENTERIT;
.GROUP SKIP 10;
←%2Mode of %3complex%*.
.END
.BEGIN GROUP;CENTERIT;
.GROUP SKIP 10;
←%3y %1is%* complex[1.0 ; 1.0]
.END
This is interesting and useful notation but in
itself gives us no new power. The innovation is that we can
also describe our algorithms as graphical transformations of the
data graphs. The basic statements of the language are %2pattern-match-
and-replacement%* rules. That is, if a designated pattern can be
found in the current state of the data graph, then replace it with
a new pattern. The only kind of replacement allowed is the %2swinging%*
of an arrow so that its head moves from one node to another.
Where the arrow head strikes a node is immaterial; the origin of the tail
%2is%* important.
The tails of the arrows are fixed (as are the number of data nodes
and arrows). The individual statements are combined into an
algorithm by success and failure conditions. If a pattern-match-replacement
is successful then we go to one AMBIT/G statement;
if the match fails then we go to another statement.
The other two components of an AMBIT/G program are declaration
and initialization. We declare the shapes which will be used in
the algorithm and declare the number and relative positions of the
arrows which can radiate from each type of node. The initialization
(which we usually suppress) sets the initial pattern of the data graph.
An example is long overdue:
.<<pic of AMBIT/G for car>>
.BEGIN GROUP;SELECT 2;centerit;
.GROUP SKIP 20;
←picture of AMBIT/G algorithm for %3car%*
.END
This algorithm represents the action of the %3car%* routine. Arrows
which are suppressed are not needed for matching. The solid links
represent links which must be found if the rule is to succeed and the
dotted links represent links which are to be set (as solid links) if
the match succeeds. The exit labeled S is used if all goes as
planned.
The circle is a notational convenience: it matches any shape.
Thus in this example, %3car%* would succeed as long as AC1 points to a
cell in free space; otherwise we get an error.
Or here's a more complicated example:
.<< pic of reverse>>
.GROUP SKIP 10;
In this piece of program we have a new element: nodes or (solid) links
whose perimeter is circled are to be tested after the match has been made
and before any specified modifications are executed. If the tests fail
then the F exit is taken from the program.
The general execution of an AMBIT/G block can be described in the usual
paradigm: selection, testing, and construction.
First the data is selected. A match of the nodes and solid links is attempted.
If this match cannot be made, an error has occurred.
Next, the testing of circled components is done. If a test fails, then
the failure exit is taken.
Finally the constructions, designated by the dotted links, are performed
and the success exit is taken.
**** add technical here: data graph permanence, functionality, constraints
and general b.s.***
On following pages are AMBIT/G algorithms for ⊗→%3cdr%*↔←, ⊗→%3cons%*↔←,
⊗→%3eq%*↔←, and ⊗→%3atom%*↔←.
The following conventions pertain to these implementations:
%21.%* We assume AC1, AC2, ...,ACn are pointers to the n arguments
of a function (requiring n arguments).
%22.%* We assume that every function, on completion of its action, will set
AC1 to point to the value of that function.
.TURN ON "#";
There are three basic shapes:
##%7α~~~~]%*,##%7[~~~]~~~]%*## and ##%7[~~~~~~]%*## corresponding
to atom headers, free space and full word space objects. Pointers will be given a
separate shape:###%7A##%*; and when we describe the garbage collector
we will need a `marked-unmarked' shape: ##%7G%*## . One final point:
the `down arrow' on ##%7[~~~]~~~]%*## and ##%7[~~~~~~~]%*##, is usually implicit
in implementations. What does it represent?
******picture of shapes****
.BEGIN GROUP;CENTERIT;SELECT 2;
.GROUP SKIP 30;
←Picture of Shapes
.END
***review***
Graphical languages are a very natural and powerful means of describing processes.
As examples, the next sections will give two LISP garbage collectors, one
very natural, one more complicated but easily understood; then an AMBIT/G
description of parsing techniques due to T. Cheatham, called ⊗→syntactic dominoes↔←;
and finally an AMBIT/G description of the operators for the AI chestnut, called
the Monkey and Bananas problem.
.SS(garbage collection again,garbage collection)
Notes: The right-side arrows on ##%7α~~~]%*## and ##%7[~~~]~~~]%*##
are used by the garbage collector. First we sweep from
##%7[~~~~~~]%4TOP%1# to##%7[~~~~~~~~~]%4BOTTOM%1#,##
setting the side arrows to %7G%4NA%1#;
when we mark, we reset those nodes which are reachable to point to %7G%4A%1.
Then the final sweep phase will collect all those nodes still marked,
%7G%4NA%1## into a new free space list, pointed to by FSP .
%7α[~~~~]%4<name>%1## is the atom header; <name> need not be present.
%7A%4MKR%1## is a pointer used during the garbage collection.
%7A%4P%1## is a pointer used to save the cdr parts of lists during
the marking phase of the garbage collector.
%7[~~~~]%4TOP%1,## %7[~~~~~~]%4BOTTOM%1## are used to delimit the boundaries of
free space.
The ⊗→garbage collector↔← has been slightly simplified. We should mark
more than just ⊗→OBLIST↔← (the symbol table); for example, what
%7A%4AC1%1 and %7A%4AC2%1 point to should be marked. There are other
intermediate results which must be preserved; these will become
apparent as we proceed.
We must also be careful about the order in which the dotted lines are
`executed'; often we will number the arrows to indicate the order of
execution.
Following this introduction is the main structure of the first LISP garbage
collector. The heart of the collector is the marking algorithm.
Some care needs to be exercised here. Since we are marking binary list
structure in a sequential manner, we need to make a decision as to
whether to mark the ⊗→%3car%*↔←-part first or mark the ⊗→%3cdr%*↔←-part first.
Actually the order in which we do these operations isn't important.
What %2is%* important is that while we are marking the first-chosen
branch, we must remember where the other branch is so that we
can subsequently mark it. Also since this marking process is
obviously recursive -- branches may also have branches-- we
must be prepared to remember an arbitrary number of `other branches'.
The pointer, P, is used to help remember these pending branches.
As you chase the arrows in the AMBIT/G marking algorithm, notice
that:
%21.%* We always mark the %3car%*-branch first. This is usually called
pre-order traversal.
%22.%* As we prepare to examine the %3car%*-part of a tree we save a pointer
to that node of the tree, using P.
%23.%* After we have marked the %3car%*-part, we use the information saved by
P to traverse the %3cdr%*-part.
P is pointing to a sequence of cells linked together by their
`down arrows'. When we go down the %3car%*-part, we save the parent-node
in the cell currently being pointed to by P. Then we set P to its
`down arrow' successor:
.BEGIN GROUP;centerit;
.GROUP SKIP 10;
←%2Push
.END
As long as the %3car%*-part of the structure we are marking points
into free space, we will continue to update P. If you look back
through the elements which we have added to P you will see a
history of those nodes whose %3car%*-parts have been marked, but whose
%3cdr%*-parts are still to be examined. When we finally terminate the
%3car%*-marking we will pick off the last node in P, decrement P, and
then traverse that substructure:
.BEGIN GROUP;centerit;
.GROUP SKIP 10;
←%2Pop
.END
Thus, P and its associated cells are behaving as a stack.
Recall that we are storing information as list structure and thus
have intersecting branches.
Now since we do have intersections, the marking process can be a little
clever. As soon as the marker comes across a previously marked
cell we know that everything below that point in the structure has already
been marked. We can then terminate the marker on that branch.
Here then, is the simple AMBIT/G garbage collector for LISP followed by
a partial description in LISP.
.CENT(A simple LISP garbage collector in AMBIT/G)
.NEXT PAGE
.NEXT PAGE
.CENT(A simple LISP garbage collector written in LISP)
We will now write a garbage collector in LISP to mark and sweep nodes
in FS and FWS.
It will have three main functions:
.BEGIN INDENT 0,10;
%3initialize[x;y]%* to initialize the marking device for each cell in the space from
%3x%* to %3y%*.
%3mark[l]%* to do the actual marking of a list %3l%*.
%3sweep[x;y]%* to collect all inaccessible cells in the space delimited
by %3x%* and %3y%*.
.END
%3initialize%* will be called twice; once for FS and once for FWS.
%3mark%* will be called for each base register which points to
active (binary) list structure.
The basic structure of the marker is: if the word is in FWS mark it and leave;
if the word has already been marked simply leave. Otherwise the word is
in FS and thus has a %3car%* and a %3cdr%*; mark the word; mark the %3car%*;
mark the %3cdr%*. The marker is thus recursive; all of the stack manipulation
done by the AMBIT program will be hidden by the recursion.
%3sweep%* will be called twice; once to generate a new free space list and once
to generate a new full word space list. Elements of these free
lists will be chained together by their %3cdr%* parts.
The initialization and sweep phases of this garbage collector are very similar;
both phases must be assured of reaching every node in the space. We used the
%3down%*-arrows for this.
These main functions use several other functions and predicates:
.BEGIN INDENT 0,10;
%3fwswrdp[x]%* is true just in the case that %3x%* is a word in FWS. This is
used as one of the termination conditions of %3mark%*.
%3makeA[x]%* marks word %3x%* as accessible.
%3makeNA[x]%* marks word %3x%* as not accessible.
%3NAp[x]%* is true if word %3x%* is not accessible.
%3down[x]%* follows the "down" link of the node %3x%*.
%3conca[x;y]%* attaches node %3x%* to the front of the list %3y%*.
The value returned is the new list.
%3conca%*
is easily described in AMBIT. Can you write %3conca%* as a LISP function?
.END
.BEGIN; CENTERIT;GROUP;SELECT 2;
.GROUP SKIP 7;
←AMBIT diagram for %3conca%*.
.END
.BEGIN SELECT 3;TABIT2(18,20);TURN OFF "←";
initialize <= λ[[x;y]
\prog[[]
\ a\makeNA[x];
\\x ← down[x];
\\[eq[x;y] → return[T]];
\\go[a]]]
.APART;
.GROUP;
mark <= λ[[l] [\not[NAp[l]] → T;
\fwswrdp[l] → makeA[l];
\T → makeA[l];mark[car[l]];mark[cdr[l]] ]]
.APART
.GROUP
sweep <= λ[[x;y]
\prog[[z]
\ a\[NAp[x] → z← conca[ x;z]];
\\x ← down[x];
\\[eq[x;y] → return [z]];
\\go[a]]]
.END
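These pieces might be invoked as follows; the boundary names %3FSTOP%*, %3FSBOTTOM%*,
%3FWSTOP%*, and %3FWSBOTTOM%*, the full word space free list %3FWSP%*, and the
treatment of the registers are only suggestive -- as noted above, everything the
base registers point to must also be marked.
.BEGIN VERBATIM;SELECT 3;
collect <= λ[[]
   prog[[]
        initialize[FSTOP;FSBOTTOM];
        initialize[FWSTOP;FWSBOTTOM];
        mark[OBLIST];
        mark[AC1];
        mark[AC2];
        FSP ← sweep[FSTOP;FSBOTTOM];
        FWSP ← sweep[FWSTOP;FWSBOTTOM];
        return[T]]]
.END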
.CENT(A link-bending garbage collector for LISP)
The description of this very simple collector is easily understood either as
a LISP function or as an AMBIT program. The collector we are about to describe
is more naturally described graphically. First notice that the AMBIT program
uses an explicit stack to store traversing information whereas in the LISP
description the use of a stack is implicit in the recursion of %3mark%*.
The use of a stack is one of the difficulties associated with this simple
collector. Garbage collection is invoked when available space has become
exhausted, but here we are, asking for %2more%* space to use for stacking.
The usual solution is to allocate a separate area for stack storage. This
has its drawbacks. If we don't allocate enough stack space, i.e., if the
depth of a piece of structure becomes too great, then the marker will fail.
Notice too that the amount of stack space can become large: proportional to the
length of a list.
.GROUP SKIP 8;
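The stack can of course be made explicit in the LISP version as well. A sketch
(the name %3mark1%* is hypothetical, and the stack is represented as an ordinary
list -- which means this marker %2conses%*, precisely the difficulty just described):
.BEGIN VERBATIM;SELECT 3;
mark1 <= λ[[l]
   prog[[p]
        p ← cons[l;NIL];
   a    [null[p] → return[T]];
        l ← car[p];
        p ← cdr[p];
        [not[NAp[l]] → go[a];
         fwswrdp[l] → makeA[l];go[a];
         T → makeA[l];
             p ← cons[cdr[l];p];
             p ← cons[car[l];p];
             go[a]]]]
.END
The list %3p%* plays the role of the pointer P in the AMBIT/G marker, remembering the
branches still to be traversed.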
Another means of dispensing with a stack when traversing a tree is to
use a more complicated representation for each node. Notice that
if we examined "snapshots" of the traversal of a tree, then
as long as we traversed the tree in the same order each time, the contents
of the stack would be identical ⊗↓except for perhaps the actual machine locations
used in the stack←. Instead of replicating this portion of a stack on each
traversal we could save the stack information with the nodes in the tree.
This technique is called ⊗→threading↔← ⊗↓See Knuth's Kompendium←.
Threading complicates the representation and causes some difficulties when
we wish to store list-structure rather than trees. However the idea
of dispensing with the stack %2is%* worthwhile.
We can strike a compromise. Instead of permanently storing
the threads in the structure we can modify the structure as we traverse it,
use the threads to mark, and then to restore the structure to its original
topology.
Certainly the marker will be more complicated, but the complication is
not insurmountable. In fact, it is even reasonably straightforward to prove
the correctness of such an algorithm ⊗↓ W. deRoever, private communication.←.
Finally, here is such a "link-bending" marker written in AMBIT/G:
.NEXT PAGE;
To reinforce ideas, here is a simple data graph and a sketch of the
modifications to it as the marker operates:
.NEXT PAGE;
.SS(Syntactic dominoes)
The most important parts of our discussions involve data structures, i.e.,
sexprs already in memory. An important, but peripheral, problem is the input
of such text: translating the linear representation of sexpressions into the
binary list-structure representation. We have mentioned, while discussing %3read%*
and friends, that such a program is called a ⊗→parser↔←. The general techniques
which are used by parsers are easily described using a game called
syntactic dominoes. This clever device was invented by T. Cheatham and
is described in AMBIT/G.
.SS(The Contour Model,Contour Model,P31:)
Certain phases of the execution of LISP programs, in particular function
calls and variable bindings are better described in another graphical
language, the Contour Model.
This model, though well known intuitively by any translator writer, was
recently made explicit and embellished by J. Johnston. Its ability
to clearly show the binding mechanisms in block-structured languages
is exceptional.
Note first that this model is a different type than AMBIT/G. The
Contour Model is essentially a trace of the execution of a specific program, whereas
AMBIT/G is a reasonable candidate for describing the semantics of a
complete language. Indeed, AMBIT/G has been used to document the extensible
language, BASEL, and LISP.
The simplest way to introduce the Contour Model is by example:
.SKIP TO COLUMN 1;
.NEXT PAGE;
.SS(Macros and special forms,macros,P18:)
Most of the discussion to this point has dealt with getting
a new efficient symbol table which can fulfill our requirements
of permanence and multiplicity of properties. Little has been said
about user controlled function calling, which we called the desire for
generality. Before introducing the definition of the new evaluator
we shall deal with that aspect.
Recall our discussion of ⊗→special forms↔← in {yonss(P8)}.
Special forms have been used for two purposes: to control the evaluation
of arguments (conditional expressions, quoted expressions,
%3and, or%*, etc.), and to create the effect of functions with an indefinite
number of arguments (%3list, append, plus, ...%*).
Frequently this second application of special forms can be improved upon,
particularly in the presence of a compiler.
Describing such functions as special forms is sufficient,
but not efficient, since the body of the definition must
contain explicit calls on %3eval%*. Even though we wish to define
some functions as if they had an arbitrary number of arguments, when
we %6use%* (i.e. call) the function, the number of arguments to be
applied %6is%* known. In particular, when the compiler is examining the
program, the number of arguments is known. Hopefully the compiler can be
made smart enough to compile better code than calls on %3eval%*.
That hope is well-founded.
Assume, for example, we wish to define %3plus%* as a function with
an indefinite number of arguments:
.BEGIN CENTERIT;SELECT 3;GROUP;
←plus[4;5] = 9
←plus[4;5;6] = 15
←plus[4;add1[2];4] = 11
.END
That is, %3plus%* is to have the properties of a function: its arguments
are to be evaluated (from left to right); but it can take an arbitrary number.
What is needed here seems clear: define %3plus%* in terms of a %2binary%*
addition function, %3*plus%*.
.BEGIN CENTERIT;SELECT 3;GROUP;
←plus[4;5] = *plus[4;5] = 9
←plus[4;5;6] = *plus[4;*plus[5;6]] = 15
←plus[4;add1[2];4] = *plus[4;*plus[add1[2];4]] = 11
.END
That is, we %2expand%* calls on %3plus%* into a composition of calls on
%3*plus%*.
%3plus%* is being used as a %2macro%* and the expansion process in terms
of %3*plus%* is called %2macro expansion%*. Notice that this macro expansion
can be done by a compiler, generating a sequence of calls on %3*plus%*.
Realize too, that since LISP programs must perform equivalently when
interpreted, we must recognize a macro construction inside %3eval%*.
How are macros written and how do they work? The body of a macro is a
λ-expression of %2one%* argument. The %2use%* of a macro looks just like
an ordinary function call, but what is bound to the λ-variable is the whole
call on the macro. When you look at the evaluation mechanism for
macros in the %aSM%*-%3eval%* you will see that the result of the macro
expansion is passed to %3eval%*. Thus the task of the macro body is to
expand the macro call and return this expansion as its value.
The task of the compiler will be quite similar; it will expand the macro
but instead of evaluating the expansion it must %2compile%* it.
.BEGIN TURN ON "#";
Let's define %3<%4m%*=%1 to mean "#is#defined#to#be#the#macro#...".
Then a simple macro definition of %3plus%* might be:
.END
.BEGIN SELECT 3;TABIT2(10,25);GROUP;
.P58:
\plus <%4m%*= λ[[l]\[eq[length[l];3] → cons[*PLUS;cdr[l]];
\\ T → list[*PLUS;cadr[l];cons[PLUS;cddr[l]]]] ]
.END
Thus a call %3(PLUS 3 4 5)%* would bind %3l%* to %3(PLUS 3 4 5)%* and
the evaluation of the body would result in %3(*PLUS 3 (PLUS 4 5))%*.
Evaluation of this expression would result in another call on the macro.
This time %3l%* would be bound to %3(PLUS 4 5)%*. Now
%3eq[length[l];3]%* %2is%* true and the value returned is %3(*PLUS 4 5)%*.
This can be evaluated, giving 9, and finally the outermost call on %3*PLUS%*
has all its arguments evaluated, and we get the final answer, 12.
This is a very simple (and inefficient) macro definition. Since the body
of a macro has available all of the evaluation mechanism of LISP, and
since the program structure of LISP is also the data structure, we can
perform arbitrary computations inside the expansion of the macro. Truly
LISP macros have the power to cloud men's minds!!!
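Since the whole of LISP is available while the macro is being expanded, the expansion
can also be produced in one step instead of re-invoking the macro on each shorter call.
A sketch, using a hypothetical helper %3expand+%*:
.BEGIN VERBATIM;SELECT 3;
expand+ <= λ[[l] [null[cddr[l]] → cons[*PLUS;l];
                  T → list[*PLUS;car[l];expand+[cdr[l]]]]]
.END
With %3plus <%4m%*= λ[[l] expand+[cdr[l]]]%1, a call %3(PLUS 3 4 5)%* expands directly
to %3(*PLUS 3 (*PLUS 4 5))%*, and no further uses of the macro occur.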
Notice that %3SETQ%* can easily be defined as a macro
over %3SET%*:
.BEGIN CENTERIT;SELECT 3;GROUP;
←setq <%4m%*= λ[[l] list[SET;list[QUOTE;cadr[l]];caddr[l]]].
.END
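For example, %3(SETQ X (CAR L))%* expands to %3(SET (QUOTE X) (CAR L))%*: the variable
is quoted, while the expression giving the new value is left for %3eval%*.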
.P54:
The effect of "<%4m%*=" should be clear:
it puts the body of the macro definition on the property-list of the macro
name under the indicator, %3MACRO%*. Likewise "<=" puts the function body
on the p-list under the indicator, %3EXPR%*. Similarly, we will define
"<%4f%*=" to mean "..is defined to be a special form..." and whose effect
is to put the body under the indicator, %3FEXPR%*. The body of a fexpr
definition is a λ-expression of (usually) a single λ-variable, say %3l%*. When the
fexpr is called we bind the list of %2unevaluated%* arguments to %3l%*.
Thus we could define %3plus%* by:
.BEGIN SELECT 3; GROUP;TABIT2(10,12);turn off "←";
.P59:
plus <%4f%*= λ[\[l] prog[[sum]
\\sum ← 0;
\a\[null[l] → return[sum]];
\\sum ← *plus[sum;eval[car[l]]];
\\l ← cdr[l];
\\go[a]]]
.END
Or %3and%* could be defined as:
.BEGIN SELECT 3;CENTERIT;GROUP;TABIT2(20,35);
←and <%4f%*= λ[[l]evand[l]]%1
where %3evand%* is defined as:
%3
\evand <= λ[[l][\null[l] → T;
\\eval[car[l]] → evand[cdr[l]];
\\T → NIL]]
%*
.END
Notice that this definition of %3evand%1 differs from the one on
{yon(P53)}. The earlier definition required an explicit symbol table;
the newer one uses the global table. This is a mixed blessing. As usual
there are difficulties in getting the correct bindings of variables.
Most applications of %3FEXPR%*s involve explicit calls on %3eval%*
within the body of the %3FEXPR%*.
Consider the following sequence :
.BEGIN TABIT2(20,37);SELECT 3;TURN OFF "←";
\wrong <%4f%*= λ[[x] prog[[y]
\\y ← 2;
\\return[eval[car[x]]]]]
\y ← 0;
\wrong[y];
.END;
Execution of the above sequence shows the value of %3wrong[y]%* to be %32%*,
rather than the expected %30%*. Clearly, the problem is that the call on
%3eval%* takes place in the wrong environment. To alleviate this situation
%3FEXPR%*s may be defined with %2two%* arguments. In this case a %2call%*
on the %3FEXPR%* will bind the environment %2at the point of call%* to that
second argument. %3eval%*,and %3apply%* are allowed to be called with either
one or two arguments. If a second argument is present, it is used as the
environment during evaluation.
Thus:
.BEGIN TABIT2(20,39);SELECT 3;TURN OFF "←";
\right <%4f%*= λ[[x;a] prog[[y]
\\y ← 2;
\\return[eval[car[x];a]]]]
\y ← 0;
\right[y];
.END;
The call on %3right%* will use the environment with %3y%* being %30%*
rather than %32%*.
Macros can be used to perform the operations which special forms
do, but with perhaps more efficiency.
The idea of macro processing is not recent. Some of the earliest assemblers
had extensive macro facilities. Lately macros have been used as a means
of extending so-called high level languages.
One of the most simple kinds of macros is %2textual substitution%*.
That is, when a use of a macro is detected we simply replace the call by its
body. A slightly more sophisticated application is the %2syntax macro%*.
Every time we come across an application of a syntax macro the expander
processes it as if it had never been seen before even though much
of the expansion is repetitious. That is, syntax macros have no memory.
%2Computational macros%* are an attempt to reduce some of this repetition.
In this scheme a certain amount of processing is done at the time the macro
is %2defined%*. For example a computational subtree reflecting the body of the
macro might be formed. Then whenever the macro is %2used%* we can
simply make a copy of this subtree and "glue" this subtree into the parse-tree
which we are building. This computational subtree is commonly formed by passing
the body of the macro through the compiler in a
"funny" way. The main problem with the computational macro is that there are
many desirable macros which have no such subtree, or there is other information
necessary to process the macro. There are solutions to this problem,
one of which closely parallels the abstract syntax ideas of McCarthy.
All of these styles of macros are subsets of the LISP macros.
Many versions of LISP also have a very simple syntax macro (called ⊗→%3read%* macros↔←)
independent of
the <%4m%*= - facility. Constants appear quite frequently in LISP expressions;
and so we have already introduced abbreviations for the sexpression
representation of numbers, %3T%* and %3NIL%*. That is we don't %3QUOTE%* them.
%3read%* macros are used to abbreviate other sexpressions. The most
common %3read%* macro is used to abbreviate the construct:
.TURN ON "←";
←%3(QUOTE %*<sexpression>%3)%*.
A special character is chosen to designate the macro, say "@". Then
whenever %3read%* sees "@", the next sexpression, s, is read in and
the value: %3(QUOTE %1s%*)%1 is returned by %3read%*. Thus:
←@<sexpression> is an abbreviation for %3(QUOTE <sexpression>).%1
In some systems the user may define his own %3read%* macros, in others
they are pre-defined.
.SS(Problems)
I. Define %3list%* and %3append%* as macros. You may use only
the LISP primitives (functions, predicates, conditionals and
recursion) and a binary function, %3*append%*.
II. Give a macro definition of an extended %3SETQ%*, which is called
as %3(SETQ#%1var%41%*#exp%41%*#...#var%4n%*#exp%4n%3)%1.
Each var%4i%* is a name; each exp%4i%* is an expression to be evaluated
and assigned to var%4i%1. The assignments should go from "left-to-right".
Thus %3(SETQ X 2 Y (TIMES 2 X) X 3)%* when executed should assign
%33%* to %3X%* and %34%* to %3Y%*.
III. Define %3list%* as a fexpr.
.SS(%aSM%*-%3eval%*,%3eval%*,P30:)
Finally, here is the new evaluator.
(the `line numbers' are not part of the definitions)
.BEGIN VERBATIM;SELECT 3;
eval <=
1. λ[[x]prog[[y]
2. return[[numberp[x] → x;
3. atom[x] → [y ← get[car[x];VALUE] → cdr[y];
              T → err[UNBOUND VARIABLE]];
4. atom[car[x]] → [y ← getl[car[x];(EXPR FEXPR MACRO)]
5. → [eq[car[y];EXPR] → apply[cadr[y];mapcar[function[eval];cdr[x]]];
6. eq[car[y];FEXPR] → apply[cadr[y];list[cdr[x]]];
7. T → eval[apply[cadr[y];list[x]]]];
8. y ← get[car[x];VALUE] → eval[cons[cdr[y];cdr[x]]];
9. T → err[UNDEFINED FUNCTION]];
10. T → apply[car[x];mapcar[function[eval];cdr[x]]]]]]]
apply <=
λ[[fn;args]
11. [atom[fn] → [get[fn;EXPR] → apply[get[fn;EXPR];args];
12. T → apply[eval[fn];args]];
13. eq[car[fn];LAMBDA] → prog[[z]
14. bind[cadr[fn];args];
15. z ← eval[caddr[fn]];
16. unbind[];
17. return[z]];
18. T → apply[eval[fn];args] ]]
.END;
First let's describe the new LISP system functions which are used here.
They are:
.BEGIN INDENT 0,10;
.p52:
%3get[x;y]: %*
%3x%* is an atom; %3y%* is an indicator. ⊗→%3get%*↔← will return the value-part
of the attribute-value pair
associated with %3y%* under the atom %3x%*. If %3x%* does not have the
indicator %3y%*, then %3NIL%* is returned.
.END
.BEGIN INDENT 0,10;
%3getl[x;y]:%*
%3x%* is an atom; %3y%* is a list of indicators (see line 4). ⊗→%3getl%*↔←
will search the property list of the atom %3x%* for the first occurrence
of an indicator which appears in the list %3y%*. If such a match
is found, then the remainder of the p-list, beginning with the
matching indicator, is returned. If no match is found, %3NIL%* is
returned.
.END
.BEGIN CENTERIT;GROUP;
An example:
.GROUP SKIP 6;
←%3getl[FOO;(BAZ PNAME)]%* has value: %3get[FOO;PNAME]%* has value:
.END
The virtue of %3getl%* is that it can distinguish between an atom which has
an indicator with value %3NIL%* and an atom which does not have the indicator.
%3get%* cannot communicate this distinction.
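For example, suppose the indicator %3BAZ%* appears on the p-list of %3FOO%* with value
%3NIL%*. Then %3get[FOO;BAZ]%* returns %3NIL%*, exactly as it would if %3BAZ%* did not
appear at all; but %3getl[FOO;(BAZ)]%* returns the tail of the p-list beginning with
%3BAZ%* in the first case and %3NIL%* only in the second.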
First, on lines 5 and 10 we use the LISP mapping function, %3mapcar%*, to evaluate the argument
list during function calls. See {yon(P34)}.
Notice line %33.%* is an application of an obscene LISP coding trick
described on {yon(P28)}: the value of the assignment %3y ← get[car[x];VALUE]%* acts as
the predicate of the clause, so the clause fails exactly when %3get%* returns %3NIL%*.
Notice too, that line %33%* uses %3get%*. You
should understand why %3get%* works and %3getl%* is not required.
There are enough important new details in this %3eval-apply%* family
to warrant spending a bit of time. Most of the changes involve symbol table
operations. %3eval%* (and %3apply%*) no longer carry an explicit symbol table.
The property-list of the atom contains the information now.
In this new scheme of things, %3get%* and %3getl%* search the symbol table
(p-list) for the appropriate bindings.
%3eval%* first checks if it has been given a number; any other atomic
expression given to %3eval%* is expected to be a variable and to
have a ⊗→%3VALUE%*-cell↔←.
If the %3car%* of the expression given to %3eval%* is atomic, then the
p-list of that atom is examined for one of three indicators.
We have already seen ⊗→%3EXPR%*↔← on {yon(P29)}; it is the indicator designating
a function represented as a λ-expression.
Evaluate the arguments and call %3apply%*. The indicator ⊗→%3FEXPR%*↔←
designates a special
form. Pass the %2un%*evaluated argument list to %3apply%*.
The indicator, %3MACRO%*, is recognized on line %37%*.
There are actually two other indicators recognized by %3eval%*. ⊗→%3SUBR%*↔← is
the machine-language version of an %3EXPR%*; and ⊗→%3FSUBR%*↔← is the machine
version of a %3FEXPR%*.
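To see the dispatch in action, consider %3plus%*. With the %3FEXPR%* definition of
{yon(P59)} the atom %3PLUS%* carries the indicator %3FEXPR%*, and evaluation of
%3(PLUS 3 4)%* goes through line %36%*, binding the %2unevaluated%* list %3(3 4)%* to
%3l%*; with the macro definition of {yon(P58)} the indicator is %3MACRO%*, and line %37%*
applies the body to the whole form %3(PLUS 3 4)%* and then evaluates the resulting
expansion, %3(*PLUS 3 4)%*.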
.SS(Binding revisited,bind,P78:)
The most interesting changes in the evaluator involve binding
and unbinding of variables.
We have seen that the old symbol table mechanism has the correct
effect for the proper evaluation of recursive functions. As
we bound variables to values on entry to a λ-expression, we
%3cons%*ed the new bindings onto the front of the symbol table. This
had two results:
%21.%* In the evaluation of the body of the λ-expression we got the
correct bindings of the variables.
%22.%* The old value was saved so that we could retrieve it on leaving
the λ-body.
Now with the new table organization we need to develop the same
facility. It is not sufficient for the binding of λ-variables to
simply change the contents of the ⊗→%3VALUE%*-cell↔←. This would violate
condition %22%*. Besides changing the value, we must save the prior
value.
The crucial phrases are lines 14 and 16 in the new version of %3apply%*.
⊗→%3bind%*↔← is a function, taking two arguments. The first is the list of
λ-variables; the second is the list of new values. %3bind%* saves the
old values and rebinds the λ-variables to the new values.
⊗→%3unbind%*↔← is a function of no arguments used to restore a list
of λ-variables to their previous values. How is
this wonderous feat performed?
We have a ⊗→stack↔←, or ⊗→pushdown-list↔←, called SP
(for %2S%*pecial %2P%*ushdown) in which we can
save old values of variables. %3bind%* will add values to SP; %3unbind%*
restores the saved values. Actually because of the stack-like
behavior of the binding and unbinding processes we can increase
the efficiency of %3bind%* and %3unbind%*. %3bind%* saves both the value
and the location of the %3VALUE%*-cell for each variable being rebound.
It also puts special marks in the SP stack delimiting each block
of λ-rebindings. All %3unbind%* need do is restore the first block
of saved values. It knows the values and also knows the addresses of
the %3VALUE%*-cells.
All of the information it needed for restoration is saved in the stack.
.BEGIN GROUP;TABIT3(30,45,60);
Given %1λ[[x%41%*; ...; x%4n%*]%9x%1][a%41%*; ... ;a%4n%*], here's an example of binding:
\ to value\\old value of x%41%*
\ cell for x%41%*
\save
\and\ ...
\rebind
\ ===>
\ to value\\old value of x%4n%*
\ cell for x%4n%*
\\ |
\\ ↓
\\rebind x%4i%* to a%4i%*, changing the
\\ contents of the value cell
\\ (see %3*bind%*)
\\ |
\\ ↓
\\ evaluate %9x%*
\\ |
\\ ↓
\\restore old values using information
\\ stored in stack, SP.
\\ (see %3*unbind%*)
.END
.BEGIN CENTERIT;
.GROUP SKIP 15;
←%2The primitives, %3*bind%* and %3*unbind%*.
.END
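In LISP-like terms, %3bind%* and %3unbind%* might be sketched as follows. SP is
represented here as an ordinary list; %3MARK%* is a hypothetical block delimiter; and
%3rplacd%*, assumed for the sketch, destructively replaces the %3cdr%*-part of its first
argument -- here the %3VALUE%*-cell -- with its second. A real implementation keeps SP
in a machine stack, and must also create a %3VALUE%*-cell the first time a variable is bound.
.BEGIN VERBATIM;SELECT 3;
bind <= λ[[vars;vals]
   prog[[cell]
        SP ← cons[MARK;SP];
   a    [null[vars] → return[T]];
        cell ← get[car[vars];VALUE];
        SP ← cons[cons[cell;cdr[cell]];SP];
        rplacd[cell;car[vals]];
        vars ← cdr[vars];
        vals ← cdr[vals];
        go[a]]]

unbind <= λ[[]
   prog[[]
   a    [eq[car[SP];MARK] → go[b]];
        rplacd[caar[SP];cdar[SP]];
        SP ← cdr[SP];
        go[a];
   b    SP ← cdr[SP];
        return[T]]]
.END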
This binding scheme, called %2⊗→shallow binding↔←%* works quite well for
simple λ-binding and lookup. As with the previous a-list symbol table
we need to exercise some care when handling functional arguments. How can we
implement the equivalent of the %3FUNARG%* hack in this new binding and symbol table
organization? Recall how the old %3eval%* coped ({yon(P79)}).
When we recognized an instance of a functional argument we saved a pointer to
the current environment. When we %2applied%*
the functional argument we restored the symbol table in such a way
that global variables were accessed in the saved environment while
local variables were found in the current environment.
We must do the same. When %3function%* is seen save the current SP pointer;
when we apply the functional argument,
.BEGIN centerit;
%3←(FUNARG %*<function> <old SP>%*) %1to arg%41%*; ... arg%4n%3 ,
.END
we will first rebind all of the variables found by the old SP pointer to
those old values, while saving the current values in SP. Then we can rebind the
λ-variables (i.e., local variables) of the <function> to arg%41%* through
arg%4n%*. The environment will then be properly restored for evaluation of the
function body. When the evaluation has been completed we restore using %3unbind%*.
Compare the shallow binding strategy to that of the a-list. The a-list was always
explicit when evaluation was occurring; saving environments only required saving
a pointer to the current symbol table; changing environments at most required
calling %3eval%* with a different symbol table, and when leaving a context
the old table was restored implicitly by the recursion mechanism. Accessing
a variable involved an appreciable computation; we had to search the table
for the variable-value pair.
Shallow binding obviates the symbol table search while incurring added
expense in binding and unbinding.
The care and feeding of functional arguments is more difficult since there
is only one symbol table, not the tree of tables implicit in the
previous incarnation. True, the information necessary to recover an environment
is present in SP, but it is expensive to retrieve it.
Shallow binding: 0; a-list: 1; (Christians: 0; lions: 1).
.<<PROBS OF BINDING AND FEXPRS>>
.BEGIN TURN ON "←";GROUP;
←%2Problems%*
I Evaluate the following using the new %3eval%*:
1. %3eval[((LAMBDA(L)(PRINT L))(QUOTE A))]%* assuming %3print%* is the usual printing function.
2. %3eval[(FOO (QUOTE A))]%* where %3foo <= λ[[l]print[l]]%*.
3. %3eval[((CAR (QUOTE (FOO)))(QUOTE A))]%*, %3foo%* as above.
4. %3eval[(FOO1 (QUOTE A))]%* where %3foo1 <%4f%*= λ[[l]print[l]]%1.
and finally:
5. %3eval[((CAR (QUOTE (FOO1)))(QUOTE A))]%*.
Explain what happened. Can you "fix" it?
II Some implementations of LISP distinguish between %3EXPR%*s and %3FEXPR%*s
by modifying the prefix, %3LAMBDA%*. %3FEXPR%*s are stored with prefix %3NLAMBDA%*;
%3EXPR%*s are stored with the normal prefix %3LAMBDA%*. What are the
advantages and/or disadvantages of this scheme?
III Recall the extensions to LISP binding proposed in {yonss(P67)}.
Discuss their implementation in light of the new %3eval%* and symbol table.
.END
.SS(Review)
As things presently stand we have a basic LISP machine consisting
of an encoding of %3eval%* and friends, the I/O routines, the garbage
collector, an initial symbol table which believes in the primitive
functions and some commonly needed utility functions. We have
two areas of memory: free space, and full word space.
Expressions (forms) to be evaluated are read in, converted to list structure,
and evaluated by %3eval%*. What %3eval%* is doing is traversing the
sexpression representation of the form, interpreting the
information found there as LISP "instructions". This is an expensive process.
In the general case there is nothing we can do to reduce this cost, but very
frequently there is a mapping from the LISP expressions to be
evaluated to a sequence of actual machine instructions, which when
carried out will have the same computational effect as %3eval%*, massaging
the list structure. This mapping is called a ⊗→compiler↔←; %3eval%*
and friends are called an ⊗→interpreter↔←.
What we wish to do now is describe a compiler for a subset of LISP.
This has many motivations:
%21.%* To describe the compiler we must carefully define the machine
which will execute the instructions which the compiler produces.
Examination of a machine will reinforce the ideas of stacks, make
more concrete the structure of recursion, of garbage collection and
representation of sexprs and list structure.
%22.%* It will show a very non-trivial application of non-numerical
computation. At the same time we will see how simple it is to
describe such complex algorithms in LISP.
%23.%* It will clearly show the relationship between compilation and
evaluation. That is, the LISP function representing the compiler
will very closely parallel the structure of the evaluator, %3eval%*.
If you understand %3eval%*, then the compiler is easy.
%24.%* Everyone should see a compiler ("Shut up!", he explained).
.SS(%aSM%*:A Simple machine,%aSM%*,P33:)
This section describes a simple machine which
has a sufficient instruction set to handle an implementation of LISP.
Before we begin, note that this machine is %2not%* necessary for our
understanding of %3eval%*. %3eval%* is self-contained. We need only describe
a machine to discuss implementation and compilation. This,
indeed, is an objection to describing the meaning of programming languages
in terms of a compiler -- you must understand %2two%* things now -- the
language %2and%* a machine.
The simple machine, %aSM%*, has a slight similarity to the organization of the
PDP-10. We need very few features to adequately discuss the interesting
facets of implementation of our LISP subset. Certainly, if we were to
undertake a real implementation, many more instructions would be necessary.
Similarly, when we discuss compilation our %aSM%* suffices; but if we wished
to perform %2efficient%* compilation we would hopefully have a better
instruction set. The point here is to understand basic algorithms. If that
is accomplished it is quite straightforward to examine problems of
efficiency, and details of implementation.
%aSM%* has a conventional addressable main memory, and N special registers,
AC1, AC2, ..., ACn. These registers are called %2accumulators%*,
and as with the AMBIT/G description, these N registers are to contain pointers
to the arguments of a function at the time of invocation.
Each memory location or register is assumed to be large enough to contain
two addresses. For sake of discussion, assume the word size is 36 bits.
Assume the addressing space is then 2↑18.
The mapping of a %7[~~~]~~~]%* onto a %aSM%* location is easy; the %3car%*
maps to the left-half of the word; the %3cdr%*, to the right.
A memory area to contain full-words, %7[~~~~~~]%*, is designated.
All p-names and numbers are stored here.
Parts of %aSM%* memory can be designated as stacks. Each stack is a contiguous
area of memory, and the current top of the stack is referenced by a general
register, AC1, ..., ACn, called a %2stack-%* or %2pushdown-%* pointer.
The stacks will be used to contain the partial results of calculations and
will contain the information necessary to implement recursive function calling.
The primary LISP stack is named P.
There are only three main classes of instructions necessary to describe
LISP implementation: instructions for constant generation, instructions for
stack manipulation, and instructions for control of flow.
The control instructions and some of the stack instructions refer to the
program counter of %aSM%*. %aPC%* designates this counter. %3C%* in the following
means "contents of...".
.BEGIN TABIT1(19);TURN OFF "←";
MOVEI ACi const\%3C%1(ACi) ← const
PUSH P ACj\%3C%*(%3C%*(P)) ← %3C%*(ACj)
\%3C%*(P) ← %3C%*(P)+1. Push contents of ACj onto top of stack.
POP P ACj\%3C%*(P) ← %3C%*(P)-1
\%3C%*(ACj) ← %3C%1(%3C%*(P)). Pop top of stack into ACj.
POPJ P\%3C%*(P) ← %3C%*(P)-1
\%3C%*(%aPC%*) ← %3C%*(%3C%*(P)). Pop top of stack into %aPC%*; used as return.
.BEGIN INDENT 0,19;FILL;
CALL i fn\This instruction handles function calling. i is the number of arguments
(assumed to be in AC1 through ACi at time of call). fn represents a function name.
One effect of CALL is to leave return information on the stack, P.
.END
MOVE ACi -j P\%3C%*(ACi) ← copy of the j%8th%* element (counting from zero%8th%*) of stack P.
MOVEM ACi -j P\j%8th%* element of stack P is replaced by contents of ACi.
.BEGIN INDENT 0,19;FILL;
SUB x y\%3C%*(x) ← %3C%*(x) - %3C%*(y); Used in the context, SUB P const, to remove
a block of cells from the top of the stack.
.END
JRST j\%3C%*(%aPC%*) ← j. Go to loc j.
JUMPE ACi j\%2if%* %3C%*(ACi)=0 %2then%* %3C%*(%aPC%*) ← j;
JUMPN ACi j\%2if%* %3C%*(ACi)%c≠%*0 %2then%* %3C%*(%aPC%*) ← j;
.END
For much of the sequel, the only calling sequence of interest will be
ordinary function calls. The evaluated arguments are to be loaded into
the accumulators.
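For example, a call %3f[1;2]%* might compile into the following sketch (this is meant
only to illustrate the calling sequence, not the output of the compiler of {yonss(P32)});
the constants are generated in the accumulators, %3CALL%* transfers control and records
the return information on P, and, as with the AMBIT/G conventions, the value comes back
in AC1:
.BEGIN VERBATIM;
      MOVEI AC1 1
      MOVEI AC2 2
      CALL 2 F
.END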
Internal λ-expressions are not handled yet. (See problem IV on {yon(P36)}).
More instructions for %aSM%* will be given on {yon(P35)} when we
discuss efficiency.
.SS(An alternative: deep bindings,deep binding)
There are some weaknesses in the model of implementation espoused
in the previous sections. First, as we have seen ⊗→shallow binding↔← requires
that we be a bit careful in handling functional arguments.
Also there is an appreciable overhead in changing contexts.
Second, our strategy of requiring argument passing in a fixed number of
registers, AC%41%* through AC%4n%*, is unnecessarily restrictive. There
will be some arbitrary limit on the number of arguments which a function
may have. After a moment's reflection on the representation of the symbol table
as a stack {yon(P43)}, it should be apparent that we might try passing the
arguments to a function requiring %3n%* arguments as the first %3n%* elements
of a stack. The value returned by the function could be defined to be the
value located in the top of the stack. This holds promise. Consider the
evaluation of a form
.BEGIN CENTERIT;SELECT 3;
←f[g%41%*[ ... ]; ... g%4n%*[ ... ]] .
.END
The evaluation of the arguments is to proceed from left to right. If we perform
the procedure "calculate the argument; leave value in top of stack", then %3f%*
could expect to find values for its λ-variables as:
.BEGIN TABIT3(20,40,60);GROUP;
\value stack:\| value for %3x%4n%1\|
\\| value for %3x%4n-1%1\|
\\| ... ... \|
\\| value for %3x%41%1\|
.END
There will be no problem about searching to find the values; %3f%* "knows"
where to find the values in the stack. When %3f%* is finished its computation
it need only remove the top n elements from the value stack and add the value
which it computed.
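For example, in evaluating %3f[g[1];h[2]]%* the value of %3g[1]%* is computed and left
on the stack, the value of %3h[2]%* is then placed above it, and %3f%* finds the value
of its first λ-variable one below the top of the stack and that of its second on top;
when %3f%* is finished it removes both and leaves its own value in their place.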
This scheme seems to have the advantage of shallow binding: fast access
to values, but none of the disadvantages. It looks like the a-list scheme
for binding and unbinding. What's the problem? It's global variables.
If %3f%* wants to access a global variable how can it be found? All
we have stored on the stack is the %2value%* and there is no way that %3f%*
can "know" %2which%* value is the value of the global variable.
Surely we can solve this problem: simply store %2pairs%*##---##name-value pairs.
So what's the problem now? The expensive calculation only arises when we
access global variables. Most variables %2are%* local aren't they ---
or are they? Function names are variables and are almost always global;
%3car%* is a variable name, whose "value" is a primitive routine to
compute the %3car%* function. Seldom do you wish to redefine %3car%*.
Searching the stack for name-value pairs %2is%* more expensive than
picking up the value cell of the shallow binding scheme. We will soon see
that LISP compilers for deep binding %2can%* be made efficient
in their access of local and global variables; but the interpreter
%2will%* have to search. One of the worst cases will be the execution of a loop,
in which accesses to the same variables occur frequently.
This, perhaps is another good reason for removing control of iteration
from the hands of the programmer. The extent of (or even the presence of) a loop
which the user is
controlling by tests and goto's is difficult to discover. If a loop
is controlled by language constructs (WHILE, REPEAT, etc.) then the
interpreter might have some chance of improving the execution of the loop.
As we have previously said, deep binding is very reminiscent of the a-list
symbol table structure which was first a stack, then embellished to become
a tree when functional arguments appeared. But recall that the a-list scheme
had each name-value pair chained to its successor. Pointers into the table
structure (environment pointers) entered into this chain at various places
to define an environment. With the a-list scheme, we were able to generate a
tree structure, but deep binding, so far, still has a stack-like structure.
Something must be done. Let's look at the a-list %3eval%* again.
First note that the environment pointers we generated for handling
functional arguments point into the symbol tables in very specific places;
they %2always%* point into the tables at the beginning of λ-bindings. They will
%2never%* point into the interior of a block of bindings. Thus each
%2block%* of λ-bindings can go onto a stack in a contiguous fashion.
Now recall the Weizenbaum environments: each environment has a local
symbol table (λ-variables, and %3prog%*-variables); each environment
also has entries for E%4a%* and E%4c%*. So we may simulate a tree on the
name-value stack if we include in each block of bindings, pointers
representing E%4c%* and E%4a%*. Thus the search strategy becomes yet a bit
more complex: when seeking a value for a variable run down the name-value
stack comparing names; when the block is exhausted we follow the E%4a%*-pointer.
To exit a block is easy and in fact much easier than the shallow binding
ritual: simply restore the E%4c%*-pointer.
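A sketch of that lookup, representing each block abstractly by two hypothetical
selectors -- %3namevalues%*, the name-value pairs bound in the block, and
%3accesslink%*, its E%4a%*-pointer:
.BEGIN VERBATIM;SELECT 3;
lookup <= λ[[x;block]
   prog[[pairs]
   a    [null[block] → return[err[UNBOUND VARIABLE]]];
        pairs ← namevalues[block];
   b    [null[pairs] → go[c]];
        [eq[caar[pairs];x] → return[cdar[pairs]]];
        pairs ← cdr[pairs];
        go[b];
   c    block ← accesslink[block];
        go[a]]]
.END
Exiting a block, by contrast, is just a matter of restoring the E%4c%*-pointer; no
values need be moved.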
***more,more***
.END "JRA";